* [PATCH 0/5] NFSv4.1 DRC rewrite version 6
@ 2009-08-27 16:07 andros
2009-08-27 16:07 ` [PATCH 1/5] nfsd41: expand solo sequence check andros
` (2 more replies)
0 siblings, 3 replies; 23+ messages in thread
From: andros @ 2009-08-27 16:07 UTC (permalink / raw)
To: bfields; +Cc: linux-nfs, pnfs
NFSv4.1 DRC Rewrite Version 6
These patches apply against git://linux-nfs.org/~bfields/linux: for-2.6.32
branch and continue to rewrite the NFSv4.1 Sessions DRC. Besides some
bug fixes:
1) The bound on the fore channel per-session DRC size are rewritten.
Instead of just using a maximum number of slots to bound the size,
The server gives the client the number of ca_maxresponsesize_cached slots it
requests bounded by NFSD_SLOT_CACHE_SIZE, NFSD_MAX_MEM_PER_SESSION and by
nfsd_drc_max_mem. Do not allow more than NFSD_MAX_SLOTS_PER_SESSION.
This allows clients to tailor a session to usage.
For example, an I/O session (READ/WRITE/COMMIT only) can have a much smaller
ca_maxresponsesize_cached (for only WRITE/COMMIt compound responses) and a lot
larger ca_maxresponses to service a large in-flight data window.
2) the page-based DRC is replaced with a buffer based DRC with each
slot table entry (struct nfsd4_slot + cache) allocated separately.
This allocation prepares us for slot size re-negotiation via the SEQUENCE
operation target and high slot id arguments.
Testing:
NFSv4.1 mount: pynfs tests - including the SEQUENCE replay cache tests.
connectathon tests.
NFSv4.0 mount: connectathon tests.
-->Andy
0001-nfsd41-expand-solo-sequence-check.patch
0002-nfsd41-bound-forechannel-drc-size-by-memory-usage.patch
0003-nfsd41-use-session-maxreqs-for-sequence-target-and.patch
0004-nfsd41-replace-nfserr_resource-in-pure-nfs41-respon.patch
0005-nfsd41-replace-page-based-DRC-with-buffer-based-DRC.patch
^ permalink raw reply [flat|nested] 23+ messages in thread* [PATCH 1/5] nfsd41: expand solo sequence check 2009-08-27 16:07 [PATCH 0/5] NFSv4.1 DRC rewrite version 6 andros @ 2009-08-27 16:07 ` andros 2009-08-27 16:07 ` [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage andros 2009-08-27 21:42 ` [PATCH 1/5] nfsd41: expand solo sequence check J. Bruce Fields 2009-08-28 21:34 ` [PATCH 0/5] NFSv4.1 DRC rewrite version 6 J. Bruce Fields 2009-08-28 22:42 ` J. Bruce Fields 2 siblings, 2 replies; 23+ messages in thread From: andros @ 2009-08-27 16:07 UTC (permalink / raw) To: bfields; +Cc: linux-nfs, pnfs, Andy Adamson From: Andy Adamson <andros@netapp.com> Failed sequenece == solo sequence. Remove redundant sequence operation cache checks. Signed-off-by: Andy Adamson <andros@netapp.com> --- fs/nfsd/nfs4state.c | 9 --------- include/linux/nfsd/xdr4.h | 2 +- 2 files changed, 1 insertions(+), 10 deletions(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index d2a0524..ddfd36f 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -991,16 +991,10 @@ nfsd4_store_cache_entry(struct nfsd4_compoundres *resp) { struct nfsd4_cache_entry *entry = &resp->cstate.slot->sl_cache_entry; struct svc_rqst *rqstp = resp->rqstp; - struct nfsd4_compoundargs *args = rqstp->rq_argp; - struct nfsd4_op *op = &args->ops[resp->opcnt]; struct kvec *resv = &rqstp->rq_res.head[0]; dprintk("--> %s entry %p\n", __func__, entry); - /* Don't cache a failed OP_SEQUENCE. */ - if (resp->opcnt == 1 && op->opnum == OP_SEQUENCE && resp->cstate.status) - return; - nfsd4_release_respages(entry->ce_respages, entry->ce_resused); entry->ce_opcnt = resp->opcnt; entry->ce_status = resp->cstate.status; @@ -1490,9 +1484,6 @@ nfsd4_sequence(struct svc_rqst *rqstp, slot->sl_inuse = true; slot->sl_seqid = seq->seqid; slot->sl_cache_entry.ce_cachethis = seq->cachethis; - /* Always set the cache entry cachethis for solo sequence */ - if (nfsd4_is_solo_sequence(resp)) - slot->sl_cache_entry.ce_cachethis = 1; cstate->slot = slot; cstate->session = session; diff --git a/include/linux/nfsd/xdr4.h b/include/linux/nfsd/xdr4.h index 5e4beb0..3f71660 100644 --- a/include/linux/nfsd/xdr4.h +++ b/include/linux/nfsd/xdr4.h @@ -467,7 +467,7 @@ struct nfsd4_compoundres { static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp) { struct nfsd4_compoundargs *args = resp->rqstp->rq_argp; - return args->opcnt == 1; + return resp->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE; } static inline bool nfsd4_not_cached(struct nfsd4_compoundres *resp) -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage 2009-08-27 16:07 ` [PATCH 1/5] nfsd41: expand solo sequence check andros @ 2009-08-27 16:07 ` andros 2009-08-27 16:07 ` [PATCH 3/5] nfsd41: use session maxreqs for sequence target and highest slotid andros ` (2 more replies) 2009-08-27 21:42 ` [PATCH 1/5] nfsd41: expand solo sequence check J. Bruce Fields 1 sibling, 3 replies; 23+ messages in thread From: andros @ 2009-08-27 16:07 UTC (permalink / raw) To: bfields; +Cc: linux-nfs, pnfs, Andy Adamson From: Andy Adamson <andros@netapp.com> By using the requested ca_maxresponsesize_cached * ca_maxresponses to bound a forechannel drc request size, clients can tailor a session to usage. For example, an I/O session (READ/WRITE only) can have a much smaller ca_maxresponsesize_cached (for only WRITE compound responses) and a lot larger ca_maxresponses to service a large in-flight data window. Signed-off-by: Andy Adamson <andros@netapp.com> --- fs/nfsd/nfs4state.c | 60 +++++++++++++++++++++++++++++++------------ include/linux/nfsd/state.h | 8 ++++- 2 files changed, 49 insertions(+), 19 deletions(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index ddfd36f..a691139 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -414,34 +414,60 @@ gen_sessionid(struct nfsd4_session *ses) } /* - * Give the client the number of slots it requests bound by - * NFSD_MAX_SLOTS_PER_SESSION and by nfsd_drc_max_mem. + * 32 bytes of RPC header and 44 bytes of sequence operation response + * not included in NFSD_SLOT_CACHE_SIZE + * */ +#define NFSD_MIN_HDR_SEQ_SZ (32 + 44) + +/* + * Give the client the number of ca_maxresponsesize_cached slots it requests + * bounded by NFSD_SLOT_CACHE_SIZE, NFSD_MAX_MEM_PER_SESSION and by + * nfsd_drc_max_mem. Do not allow more than NFSD_MAX_SLOTS_PER_SESSION. + * + * The ca_maxresponssize_cached definition includes the size + * of the rpc header with the variable length security flavor credential + * plus verifier (32 bytes with AUTH_SYS and NULL verifier) + * as well as the encoded SEQUENCE operation response (44 bytes) + * which are not included in NFSD_SLOT_CACHE_SIZE. + * We err on the side of being a bit small when AUTH_SYS/NULL verifier + * is not used. * * If we run out of reserved DRC memory we should (up to a point) re-negotiate * active sessions and reduce their slot usage to make rooom for new * connections. For now we just fail the create session. */ -static int set_forechannel_maxreqs(struct nfsd4_channel_attrs *fchan) +static int set_forechannel_drc_size(struct nfsd4_channel_attrs *fchan) { - int mem; + int mem, size = fchan->maxresp_cached - NFSD_MIN_HDR_SEQ_SZ; - if (fchan->maxreqs < 1) + if (fchan->maxreqs < 1 || size <= 0) return nfserr_inval; - else if (fchan->maxreqs > NFSD_MAX_SLOTS_PER_SESSION) - fchan->maxreqs = NFSD_MAX_SLOTS_PER_SESSION; - mem = fchan->maxreqs * NFSD_SLOT_CACHE_SIZE; + if (size > NFSD_SLOT_CACHE_SIZE) + size = NFSD_SLOT_CACHE_SIZE; + + /* bound the maxreqs by NFSD_MAX_MEM_PER_SESSION */ + mem = fchan->maxreqs * size; + if (mem > NFSD_MAX_MEM_PER_SESSION) { + fchan->maxreqs = NFSD_MAX_MEM_PER_SESSION / size; + if (fchan->maxreqs > NFSD_MAX_SLOTS_PER_SESSION) + fchan->maxreqs = NFSD_MAX_SLOTS_PER_SESSION; + mem = fchan->maxreqs * size; + } spin_lock(&nfsd_drc_lock); - if (mem + nfsd_drc_mem_used > nfsd_drc_max_mem) - mem = ((nfsd_drc_max_mem - nfsd_drc_mem_used) / - NFSD_SLOT_CACHE_SIZE) * NFSD_SLOT_CACHE_SIZE; + /* bound the total session drc memory ussage */ + if (mem + nfsd_drc_mem_used > nfsd_drc_max_mem) { + fchan->maxreqs = (nfsd_drc_max_mem - nfsd_drc_mem_used) / size; + mem = fchan->maxreqs * size; + } nfsd_drc_mem_used += mem; spin_unlock(&nfsd_drc_lock); - fchan->maxreqs = mem / NFSD_SLOT_CACHE_SIZE; if (fchan->maxreqs == 0) - return nfserr_resource; + return nfserr_serverfault; + + fchan->maxresp_cached = size + NFSD_MIN_HDR_SEQ_SZ; return 0; } @@ -466,9 +492,6 @@ static int init_forechannel_attrs(struct svc_rqst *rqstp, fchan->maxresp_sz = maxcount; session_fchan->maxresp_sz = fchan->maxresp_sz; - session_fchan->maxresp_cached = NFSD_SLOT_CACHE_SIZE; - fchan->maxresp_cached = session_fchan->maxresp_cached; - /* Use the client's maxops if possible */ if (fchan->maxops > NFSD_MAX_OPS_PER_COMPOUND) fchan->maxops = NFSD_MAX_OPS_PER_COMPOUND; @@ -478,9 +501,12 @@ static int init_forechannel_attrs(struct svc_rqst *rqstp, * recover pages from existing sessions. For now fail session * creation. */ - status = set_forechannel_maxreqs(fchan); + status = set_forechannel_drc_size(fchan); + session_fchan->maxresp_cached = fchan->maxresp_cached; session_fchan->maxreqs = fchan->maxreqs; + + dprintk("%s status %d\n", __func__, status); return status; } diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h index fb0c404..ff0b771 100644 --- a/include/linux/nfsd/state.h +++ b/include/linux/nfsd/state.h @@ -92,13 +92,17 @@ struct nfs4_cb_conn { struct rpc_cred * cb_cred; }; -/* Maximum number of slots per session. 128 is useful for long haul TCP */ -#define NFSD_MAX_SLOTS_PER_SESSION 128 +/* Maximum number of slots per session. 160 is useful for long haul TCP */ +#define NFSD_MAX_SLOTS_PER_SESSION 160 /* Maximum number of pages per slot cache entry */ #define NFSD_PAGES_PER_SLOT 1 #define NFSD_SLOT_CACHE_SIZE PAGE_SIZE /* Maximum number of operations per session compound */ #define NFSD_MAX_OPS_PER_COMPOUND 16 +/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */ +#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION 32 +#define NFSD_MAX_MEM_PER_SESSION \ + (NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE) struct nfsd4_cache_entry { __be32 ce_status; -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 3/5] nfsd41: use session maxreqs for sequence target and highest slotid 2009-08-27 16:07 ` [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage andros @ 2009-08-27 16:07 ` andros 2009-08-27 16:07 ` [PATCH 4/5] nfsd41: replace nfserr_resource in pure nfs41 responses andros 2009-08-27 17:12 ` [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage William A. (Andy) Adamson 2009-08-28 20:41 ` J. Bruce Fields 2 siblings, 1 reply; 23+ messages in thread From: andros @ 2009-08-27 16:07 UTC (permalink / raw) To: bfields; +Cc: linux-nfs, pnfs, Andy Adamson From: Andy Adamson <andros@netapp.com> This fixes a bug in the sequence operation reply. The sequence operation returns the highest slotid it will accept in the future in sr_highest_slotid, and the highest slotid it prefers the client to use. Since we do not re-negotiate the session slot table yet, these should both always be set to the session ca_maxrequests. Signed-off-by: Andy Adamson <andros@netapp.com> --- fs/nfsd/nfs4state.c | 6 +++++- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index a691139..ada10ad 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -1129,7 +1129,6 @@ nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp, * session inactivity timer fires and a solo sequence operation * is sent (lease renewal). */ - seq->maxslots = resp->cstate.session->se_fchannel.maxreqs; /* Either returns 0 or nfserr_retry_uncached */ status = nfsd4_enc_sequence_replay(resp->rqstp->rq_argp, resp); @@ -1493,6 +1492,11 @@ nfsd4_sequence(struct svc_rqst *rqstp, slot = &session->se_slots[seq->slotid]; dprintk("%s: slotid %d\n", __func__, seq->slotid); + /* We do not negotiate the number of slots yet, so set the + * maxslots to the session maxreqs which is used to encode + * sr_highest_slotid and the sr_target_slot id to maxslots */ + seq->maxslots = session->se_fchannel.maxreqs; + status = check_slot_seqid(seq->seqid, slot->sl_seqid, slot->sl_inuse); if (status == nfserr_replay_cache) { cstate->slot = slot; -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 4/5] nfsd41: replace nfserr_resource in pure nfs41 responses 2009-08-27 16:07 ` [PATCH 3/5] nfsd41: use session maxreqs for sequence target and highest slotid andros @ 2009-08-27 16:07 ` andros 2009-08-27 16:07 ` [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC andros 0 siblings, 1 reply; 23+ messages in thread From: andros @ 2009-08-27 16:07 UTC (permalink / raw) To: bfields; +Cc: linux-nfs, pnfs, Andy Adamson From: Andy Adamson <andros@netapp.com> nfserr_resource is not a legal error for NFSv4.1. Replace it with nfserr_serverfault for EXCHANGE_ID and CREATE_SESSION processing. We will also need to map nfserr_resource to other errors in routines shared by NFSv4.0 and NFSv4.1 Signed-off-by: Andy Adamson <andros@netapp.com> --- fs/nfsd/nfs4state.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index ada10ad..4695cec 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -515,7 +515,7 @@ alloc_init_session(struct svc_rqst *rqstp, struct nfs4_client *clp, struct nfsd4_create_session *cses) { struct nfsd4_session *new, tmp; - int idx, status = nfserr_resource, slotsize; + int idx, status = nfserr_serverfault, slotsize; memset(&tmp, 0, sizeof(tmp)); @@ -1278,7 +1278,7 @@ out_new: /* Normal case */ new = create_client(exid->clname, dname); if (new == NULL) { - status = nfserr_resource; + status = nfserr_serverfault; goto out; } -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC 2009-08-27 16:07 ` [PATCH 4/5] nfsd41: replace nfserr_resource in pure nfs41 responses andros @ 2009-08-27 16:07 ` andros 2009-08-28 21:33 ` J. Bruce Fields 0 siblings, 1 reply; 23+ messages in thread From: andros @ 2009-08-27 16:07 UTC (permalink / raw) To: bfields; +Cc: linux-nfs, pnfs, Andy Adamson From: Andy Adamson <andros@netapp.com> Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of holding nfsd pages in cache. Connectathon testing has shown that 1024 bytes for encoded compound operation responses past the sequence operation is sufficient, 512 bytes is a little too small. Set NFSD_SLOT_CACHE_SIZE to 1024. Allocate memory for the session DRC in the CREATE_SESSION operation to guarantee that the memory resource is available for caching responses. Allocate each slot individually in preparation for slot table size negotiation. Remove struct nfsd4_cache_entry and helper functions for the old page-based DRC. The iov_len calculation in nfs4svc_encode_compoundres is now always correct, clean up the nfs4svc_encode_compoundres session logic. The nfsd4_compound_state statp pointer is also not used. Remove nfsd4_set_statp(). Move useful nfsd4_cache_entry fields into nfsd4_slot. Signed-off-by: Andy Adamson <andros@netapp.com --- fs/nfsd/nfs4state.c | 207 ++++++++++++-------------------------------- fs/nfsd/nfs4xdr.c | 13 ++-- fs/nfsd/nfssvc.c | 4 - include/linux/nfsd/state.h | 27 ++---- include/linux/nfsd/xdr4.h | 5 +- 5 files changed, 74 insertions(+), 182 deletions(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 4695cec..2d72d5c 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -510,12 +510,22 @@ static int init_forechannel_attrs(struct svc_rqst *rqstp, return status; } +static void +free_session_slots(struct nfsd4_session *ses) +{ + int i; + + for (i = 0; i < ses->se_fchannel.maxreqs; i++) + kfree(ses->se_slots[i]); +} + static int alloc_init_session(struct svc_rqst *rqstp, struct nfs4_client *clp, struct nfsd4_create_session *cses) { struct nfsd4_session *new, tmp; - int idx, status = nfserr_serverfault, slotsize; + struct nfsd4_slot *sp; + int idx, status = nfserr_serverfault, slotsize, cachesize, i; memset(&tmp, 0, sizeof(tmp)); @@ -526,14 +536,23 @@ alloc_init_session(struct svc_rqst *rqstp, struct nfs4_client *clp, if (status) goto out; - /* allocate struct nfsd4_session and slot table in one piece */ - slotsize = tmp.se_fchannel.maxreqs * sizeof(struct nfsd4_slot); + /* allocate struct nfsd4_session and slot table pointers in one piece */ + slotsize = tmp.se_fchannel.maxreqs * sizeof(struct nfsd4_slot *); new = kzalloc(sizeof(*new) + slotsize, GFP_KERNEL); if (!new) goto out; memcpy(new, &tmp, sizeof(*new)); + /* allocate each struct nfsd4_slot and data cache in one piece */ + cachesize = new->se_fchannel.maxresp_cached - NFSD_MIN_HDR_SEQ_SZ; + for (i = 0; i < new->se_fchannel.maxreqs; i++) { + sp = kzalloc(sizeof(*sp) + cachesize, GFP_KERNEL); + if (!sp) + goto out_free; + new->se_slots[i] = sp; + } + new->se_client = clp; gen_sessionid(new); idx = hash_sessionid(&new->se_sessionid); @@ -550,6 +569,10 @@ alloc_init_session(struct svc_rqst *rqstp, struct nfs4_client *clp, status = nfs_ok; out: return status; +out_free: + free_session_slots(new); + kfree(new); + goto out; } /* caller must hold sessionid_lock */ @@ -592,22 +615,16 @@ release_session(struct nfsd4_session *ses) nfsd4_put_session(ses); } -static void nfsd4_release_respages(struct page **respages, short resused); - void free_session(struct kref *kref) { struct nfsd4_session *ses; - int i; ses = container_of(kref, struct nfsd4_session, se_ref); - for (i = 0; i < ses->se_fchannel.maxreqs; i++) { - struct nfsd4_cache_entry *e = &ses->se_slots[i].sl_cache_entry; - nfsd4_release_respages(e->ce_respages, e->ce_resused); - } spin_lock(&nfsd_drc_lock); nfsd_drc_mem_used -= ses->se_fchannel.maxreqs * NFSD_SLOT_CACHE_SIZE; spin_unlock(&nfsd_drc_lock); + free_session_slots(ses); kfree(ses); } @@ -964,116 +981,32 @@ out_err: return; } -void -nfsd4_set_statp(struct svc_rqst *rqstp, __be32 *statp) -{ - struct nfsd4_compoundres *resp = rqstp->rq_resp; - - resp->cstate.statp = statp; -} - -/* - * Dereference the result pages. - */ -static void -nfsd4_release_respages(struct page **respages, short resused) -{ - int i; - - dprintk("--> %s\n", __func__); - for (i = 0; i < resused; i++) { - if (!respages[i]) - continue; - put_page(respages[i]); - respages[i] = NULL; - } -} - -static void -nfsd4_copy_pages(struct page **topages, struct page **frompages, short count) -{ - int i; - - for (i = 0; i < count; i++) { - topages[i] = frompages[i]; - if (!topages[i]) - continue; - get_page(topages[i]); - } -} - /* - * Cache the reply pages up to NFSD_PAGES_PER_SLOT + 1, clearing the previous - * pages. We add a page to NFSD_PAGES_PER_SLOT for the case where the total - * length of the XDR response is less than se_fmaxresp_cached - * (NFSD_PAGES_PER_SLOT * PAGE_SIZE) but the xdr_buf pages is used for a - * of the reply (e.g. readdir). - * - * Store the base and length of the rq_req.head[0] page - * of the NFSv4.1 data, just past the rpc header. + * Cache a reply. nfsd4_check_drc_limit() has bounded the cache size. */ void nfsd4_store_cache_entry(struct nfsd4_compoundres *resp) { - struct nfsd4_cache_entry *entry = &resp->cstate.slot->sl_cache_entry; - struct svc_rqst *rqstp = resp->rqstp; - struct kvec *resv = &rqstp->rq_res.head[0]; - - dprintk("--> %s entry %p\n", __func__, entry); + struct nfsd4_slot *slot = resp->cstate.slot; + unsigned int base; - nfsd4_release_respages(entry->ce_respages, entry->ce_resused); - entry->ce_opcnt = resp->opcnt; - entry->ce_status = resp->cstate.status; + dprintk("--> %s slot %p\n", __func__, slot); - /* - * Don't need a page to cache just the sequence operation - the slot - * does this for us! - */ + slot->sl_opcnt = resp->opcnt; + slot->sl_status = resp->cstate.status; if (nfsd4_not_cached(resp)) { - entry->ce_resused = 0; - entry->ce_rpchdrlen = 0; - dprintk("%s Just cache SEQUENCE. ce_cachethis %d\n", __func__, - resp->cstate.slot->sl_cache_entry.ce_cachethis); + slot->sl_datalen = 0; return; } - entry->ce_resused = rqstp->rq_resused; - if (entry->ce_resused > NFSD_PAGES_PER_SLOT + 1) - entry->ce_resused = NFSD_PAGES_PER_SLOT + 1; - nfsd4_copy_pages(entry->ce_respages, rqstp->rq_respages, - entry->ce_resused); - entry->ce_datav.iov_base = resp->cstate.statp; - entry->ce_datav.iov_len = resv->iov_len - ((char *)resp->cstate.statp - - (char *)page_address(rqstp->rq_respages[0])); - /* Current request rpc header length*/ - entry->ce_rpchdrlen = (char *)resp->cstate.statp - - (char *)page_address(rqstp->rq_respages[0]); -} - -/* - * We keep the rpc header, but take the nfs reply from the replycache. - */ -static int -nfsd41_copy_replay_data(struct nfsd4_compoundres *resp, - struct nfsd4_cache_entry *entry) -{ - struct svc_rqst *rqstp = resp->rqstp; - struct kvec *resv = &resp->rqstp->rq_res.head[0]; - int len; - - /* Current request rpc header length*/ - len = (char *)resp->cstate.statp - - (char *)page_address(rqstp->rq_respages[0]); - if (entry->ce_datav.iov_len + len > PAGE_SIZE) { - dprintk("%s v41 cached reply too large (%Zd).\n", __func__, - entry->ce_datav.iov_len); - return 0; - } - /* copy the cached reply nfsd data past the current rpc header */ - memcpy((char *)resv->iov_base + len, entry->ce_datav.iov_base, - entry->ce_datav.iov_len); - resv->iov_len = len + entry->ce_datav.iov_len; - return 1; + slot->sl_datalen = (char *)resp->p - (char *)resp->cstate.datap; + base = (char *)resp->cstate.datap - + (char *)resp->xbuf->head[0].iov_base; + if (read_bytes_from_xdr_buf(resp->xbuf, base, slot->sl_data, + slot->sl_datalen)) + printk(KERN_WARNING + "nfsd: sessions DRC could not cache compound\n"); + return; } /* @@ -1091,14 +1024,14 @@ nfsd4_enc_sequence_replay(struct nfsd4_compoundargs *args, struct nfsd4_slot *slot = resp->cstate.slot; dprintk("--> %s resp->opcnt %d cachethis %u \n", __func__, - resp->opcnt, resp->cstate.slot->sl_cache_entry.ce_cachethis); + resp->opcnt, resp->cstate.slot->sl_cachethis); /* Encode the replayed sequence operation */ op = &args->ops[resp->opcnt - 1]; nfsd4_encode_operation(resp, op); /* Return nfserr_retry_uncached_rep in next operation. */ - if (args->opcnt > 1 && slot->sl_cache_entry.ce_cachethis == 0) { + if (args->opcnt > 1 && slot->sl_cachethis == 0) { op = &args->ops[resp->opcnt++]; op->status = nfserr_retry_uncached_rep; nfsd4_encode_operation(resp, op); @@ -1107,57 +1040,29 @@ nfsd4_enc_sequence_replay(struct nfsd4_compoundargs *args, } /* - * Keep the first page of the replay. Copy the NFSv4.1 data from the first - * cached page. Replace any futher replay pages from the cache. + * The sequence operation is not cached because we can use the slot and + * session values. */ __be32 nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp, struct nfsd4_sequence *seq) { - struct nfsd4_cache_entry *entry = &resp->cstate.slot->sl_cache_entry; + struct nfsd4_slot *slot = resp->cstate.slot; __be32 status; - dprintk("--> %s entry %p\n", __func__, entry); - - /* - * If this is just the sequence operation, we did not keep - * a page in the cache entry because we can just use the - * slot info stored in struct nfsd4_sequence that was checked - * against the slot in nfsd4_sequence(). - * - * This occurs when seq->cachethis is FALSE, or when the client - * session inactivity timer fires and a solo sequence operation - * is sent (lease renewal). - */ + dprintk("--> %s slot %p\n", __func__, slot); /* Either returns 0 or nfserr_retry_uncached */ status = nfsd4_enc_sequence_replay(resp->rqstp->rq_argp, resp); if (status == nfserr_retry_uncached_rep) return status; - if (!nfsd41_copy_replay_data(resp, entry)) { - /* - * Not enough room to use the replay rpc header, send the - * cached header. Release all the allocated result pages. - */ - svc_free_res_pages(resp->rqstp); - nfsd4_copy_pages(resp->rqstp->rq_respages, entry->ce_respages, - entry->ce_resused); - } else { - /* Release all but the first allocated result page */ - - resp->rqstp->rq_resused--; - svc_free_res_pages(resp->rqstp); - - nfsd4_copy_pages(&resp->rqstp->rq_respages[1], - &entry->ce_respages[1], - entry->ce_resused - 1); - } + /* The sequence operation has been encoded, cstate->datap set. */ + memcpy(resp->cstate.datap, slot->sl_data, slot->sl_datalen); - resp->rqstp->rq_resused = entry->ce_resused; - resp->opcnt = entry->ce_opcnt; - resp->cstate.iovlen = entry->ce_datav.iov_len + entry->ce_rpchdrlen; - status = entry->ce_status; + resp->opcnt = slot->sl_opcnt; + resp->p = resp->cstate.datap + XDR_QUADLEN(slot->sl_datalen); + status = slot->sl_status; return status; } @@ -1489,7 +1394,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, if (seq->slotid >= session->se_fchannel.maxreqs) goto out; - slot = &session->se_slots[seq->slotid]; + slot = session->se_slots[seq->slotid]; dprintk("%s: slotid %d\n", __func__, seq->slotid); /* We do not negotiate the number of slots yet, so set the @@ -1502,7 +1407,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, cstate->slot = slot; cstate->session = session; /* Return the cached reply status and set cstate->status - * for nfsd4_svc_encode_compoundres processing */ + * for nfsd4_proc_compound processing */ status = nfsd4_replay_cache_entry(resp, seq); cstate->status = nfserr_replay_cache; goto replay_cache; @@ -1513,7 +1418,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, /* Success! bump slot seqid */ slot->sl_inuse = true; slot->sl_seqid = seq->seqid; - slot->sl_cache_entry.ce_cachethis = seq->cachethis; + slot->sl_cachethis = seq->cachethis; cstate->slot = slot; cstate->session = session; diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index fdf632b..49824ea 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -3064,6 +3064,7 @@ nfsd4_encode_sequence(struct nfsd4_compoundres *resp, int nfserr, WRITE32(0); ADJUST_ARGS(); + resp->cstate.datap = p; /* DRC cache data pointer */ return 0; } @@ -3166,7 +3167,7 @@ static int nfsd4_check_drc_limit(struct nfsd4_compoundres *resp) return status; session = resp->cstate.session; - if (session == NULL || slot->sl_cache_entry.ce_cachethis == 0) + if (session == NULL || slot->sl_cachethis == 0) return status; if (resp->opcnt >= args->opcnt) @@ -3291,6 +3292,7 @@ nfs4svc_encode_compoundres(struct svc_rqst *rqstp, __be32 *p, struct nfsd4_compo /* * All that remains is to write the tag and operation count... */ + struct nfsd4_compound_state *cs = &resp->cstate; struct kvec *iov; p = resp->tagp; *p++ = htonl(resp->taglen); @@ -3304,14 +3306,11 @@ nfs4svc_encode_compoundres(struct svc_rqst *rqstp, __be32 *p, struct nfsd4_compo iov = &rqstp->rq_res.head[0]; iov->iov_len = ((char*)resp->p) - (char*)iov->iov_base; BUG_ON(iov->iov_len > PAGE_SIZE); - if (nfsd4_has_session(&resp->cstate)) { - if (resp->cstate.status == nfserr_replay_cache && - !nfsd4_not_cached(resp)) { - iov->iov_len = resp->cstate.iovlen; - } else { + if (nfsd4_has_session(cs)) { + if (cs->status != nfserr_replay_cache) { nfsd4_store_cache_entry(resp); dprintk("%s: SET SLOT STATE TO AVAILABLE\n", __func__); - resp->cstate.slot->sl_inuse = 0; + resp->cstate.slot->sl_inuse = false; } nfsd4_put_session(resp->cstate.session); } diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index d68cd05..944ef01 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -576,10 +576,6 @@ nfsd_dispatch(struct svc_rqst *rqstp, __be32 *statp) + rqstp->rq_res.head[0].iov_len; rqstp->rq_res.head[0].iov_len += sizeof(__be32); - /* NFSv4.1 DRC requires statp */ - if (rqstp->rq_vers == 4) - nfsd4_set_statp(rqstp, statp); - /* Now call the procedure handler, and encode NFS status. */ nfserr = proc->pc_func(rqstp, rqstp->rq_argp, rqstp->rq_resp); nfserr = map_new_errors(rqstp->rq_vers, nfserr); diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h index ff0b771..e745100 100644 --- a/include/linux/nfsd/state.h +++ b/include/linux/nfsd/state.h @@ -94,30 +94,23 @@ struct nfs4_cb_conn { /* Maximum number of slots per session. 160 is useful for long haul TCP */ #define NFSD_MAX_SLOTS_PER_SESSION 160 -/* Maximum number of pages per slot cache entry */ -#define NFSD_PAGES_PER_SLOT 1 -#define NFSD_SLOT_CACHE_SIZE PAGE_SIZE /* Maximum number of operations per session compound */ #define NFSD_MAX_OPS_PER_COMPOUND 16 +/* Maximum session per slot cache size */ +#define NFSD_SLOT_CACHE_SIZE 1024 /* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */ #define NFSD_CACHE_SIZE_SLOTS_PER_SESSION 32 #define NFSD_MAX_MEM_PER_SESSION \ (NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE) -struct nfsd4_cache_entry { - __be32 ce_status; - struct kvec ce_datav; /* encoded NFSv4.1 data in rq_res.head[0] */ - struct page *ce_respages[NFSD_PAGES_PER_SLOT + 1]; - int ce_cachethis; - short ce_resused; - int ce_opcnt; - int ce_rpchdrlen; -}; - struct nfsd4_slot { - bool sl_inuse; - u32 sl_seqid; - struct nfsd4_cache_entry sl_cache_entry; + bool sl_inuse; + u32 sl_seqid; + int sl_cachethis; + int sl_opcnt; + __be32 sl_status; + u32 sl_datalen; + char sl_data[]; }; struct nfsd4_channel_attrs { @@ -159,7 +152,7 @@ struct nfsd4_session { struct nfs4_sessionid se_sessionid; struct nfsd4_channel_attrs se_fchannel; struct nfsd4_channel_attrs se_bchannel; - struct nfsd4_slot se_slots[]; /* forward channel slots */ + struct nfsd4_slot *se_slots[]; /* forward channel slots */ }; static inline void diff --git a/include/linux/nfsd/xdr4.h b/include/linux/nfsd/xdr4.h index 3f71660..73164c2 100644 --- a/include/linux/nfsd/xdr4.h +++ b/include/linux/nfsd/xdr4.h @@ -51,7 +51,7 @@ struct nfsd4_compound_state { /* For sessions DRC */ struct nfsd4_session *session; struct nfsd4_slot *slot; - __be32 *statp; + __be32 *datap; size_t iovlen; u32 minorversion; u32 status; @@ -472,8 +472,7 @@ static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp) static inline bool nfsd4_not_cached(struct nfsd4_compoundres *resp) { - return !resp->cstate.slot->sl_cache_entry.ce_cachethis || - nfsd4_is_solo_sequence(resp); + return !resp->cstate.slot->sl_cachethis || nfsd4_is_solo_sequence(resp); } #define NFS4_SVC_XDRSIZE sizeof(struct nfsd4_compoundargs) -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC 2009-08-27 16:07 ` [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC andros @ 2009-08-28 21:33 ` J. Bruce Fields 2009-08-30 23:10 ` [pnfs] " Benny Halevy 2009-09-01 13:48 ` William A. (Andy) Adamson 0 siblings, 2 replies; 23+ messages in thread From: J. Bruce Fields @ 2009-08-28 21:33 UTC (permalink / raw) To: andros; +Cc: linux-nfs, pnfs On Thu, Aug 27, 2009 at 12:07:44PM -0400, andros@netapp.com wrote: > From: Andy Adamson <andros@netapp.com> > > Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of holding nfsd > pages in cache. > > Connectathon testing has shown that 1024 bytes for encoded compound operation > responses past the sequence operation is sufficient, 512 bytes is a little too > small. Set NFSD_SLOT_CACHE_SIZE to 1024. > > Allocate memory for the session DRC in the CREATE_SESSION operation > to guarantee that the memory resource is available for caching responses. > Allocate each slot individually in preparation for slot table size negotiation. > > Remove struct nfsd4_cache_entry and helper functions for the old page-based > DRC. > > The iov_len calculation in nfs4svc_encode_compoundres is now always > correct, clean up the nfs4svc_encode_compoundres session logic. > > The nfsd4_compound_state statp pointer is also not used. > Remove nfsd4_set_statp(). > > Move useful nfsd4_cache_entry fields into nfsd4_slot. > > Signed-off-by: Andy Adamson <andros@netapp.com > --- > fs/nfsd/nfs4state.c | 207 ++++++++++++-------------------------------- > fs/nfsd/nfs4xdr.c | 13 ++-- > fs/nfsd/nfssvc.c | 4 - > include/linux/nfsd/state.h | 27 ++---- > include/linux/nfsd/xdr4.h | 5 +- > 5 files changed, 74 insertions(+), 182 deletions(-) > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > index 4695cec..2d72d5c 100644 > --- a/fs/nfsd/nfs4state.c > +++ b/fs/nfsd/nfs4state.c > @@ -510,12 +510,22 @@ static int init_forechannel_attrs(struct svc_rqst *rqstp, > return status; > } > > +static void > +free_session_slots(struct nfsd4_session *ses) > +{ > + int i; > + > + for (i = 0; i < ses->se_fchannel.maxreqs; i++) > + kfree(ses->se_slots[i]); > +} > + > static int > alloc_init_session(struct svc_rqst *rqstp, struct nfs4_client *clp, > struct nfsd4_create_session *cses) > { > struct nfsd4_session *new, tmp; > - int idx, status = nfserr_serverfault, slotsize; > + struct nfsd4_slot *sp; > + int idx, status = nfserr_serverfault, slotsize, cachesize, i; Just as a style thing: that list's getting a little long. Could you keep at least "status" on a separate line? > > memset(&tmp, 0, sizeof(tmp)); > > @@ -526,14 +536,23 @@ alloc_init_session(struct svc_rqst *rqstp, struct nfs4_client *clp, > if (status) > goto out; > > - /* allocate struct nfsd4_session and slot table in one piece */ > - slotsize = tmp.se_fchannel.maxreqs * sizeof(struct nfsd4_slot); > + /* allocate struct nfsd4_session and slot table pointers in one piece */ > + slotsize = tmp.se_fchannel.maxreqs * sizeof(struct nfsd4_slot *); > new = kzalloc(sizeof(*new) + slotsize, GFP_KERNEL); I think this is OK for now, but maybe stick something like: BUILD_BUG_ON(NFSD_MAX_SLOTS_PER_SESSION * sizeof(struct nfsd4_slot) + sizeof(struct nfsd4_session) > PAGE_SIZE); in state.h just to warn anyone who wants to blindly bump up NFSD_MAX_SLOTS_PER_SESSION. (It's not really forbidden to kmalloc more than a page, but it's also not reliable, and if it becomes necessary then we'd rather find some way to code around it.) > if (!new) > goto out; > > memcpy(new, &tmp, sizeof(*new)); > > + /* allocate each struct nfsd4_slot and data cache in one piece */ > + cachesize = new->se_fchannel.maxresp_cached - NFSD_MIN_HDR_SEQ_SZ; > + for (i = 0; i < new->se_fchannel.maxreqs; i++) { > + sp = kzalloc(sizeof(*sp) + cachesize, GFP_KERNEL); > + if (!sp) > + goto out_free; > + new->se_slots[i] = sp; > + } > + > new->se_client = clp; > gen_sessionid(new); > idx = hash_sessionid(&new->se_sessionid); > @@ -550,6 +569,10 @@ alloc_init_session(struct svc_rqst *rqstp, struct nfs4_client *clp, > status = nfs_ok; > out: > return status; > +out_free: > + free_session_slots(new); > + kfree(new); > + goto out; > } > > /* caller must hold sessionid_lock */ > @@ -592,22 +615,16 @@ release_session(struct nfsd4_session *ses) > nfsd4_put_session(ses); > } > > -static void nfsd4_release_respages(struct page **respages, short resused); > - > void > free_session(struct kref *kref) > { > struct nfsd4_session *ses; > - int i; > > ses = container_of(kref, struct nfsd4_session, se_ref); > - for (i = 0; i < ses->se_fchannel.maxreqs; i++) { > - struct nfsd4_cache_entry *e = &ses->se_slots[i].sl_cache_entry; > - nfsd4_release_respages(e->ce_respages, e->ce_resused); > - } > spin_lock(&nfsd_drc_lock); > nfsd_drc_mem_used -= ses->se_fchannel.maxreqs * NFSD_SLOT_CACHE_SIZE; > spin_unlock(&nfsd_drc_lock); > + free_session_slots(ses); > kfree(ses); > } > > @@ -964,116 +981,32 @@ out_err: > return; > } > > -void > -nfsd4_set_statp(struct svc_rqst *rqstp, __be32 *statp) > -{ > - struct nfsd4_compoundres *resp = rqstp->rq_resp; > - > - resp->cstate.statp = statp; > -} > - > -/* > - * Dereference the result pages. > - */ > -static void > -nfsd4_release_respages(struct page **respages, short resused) > -{ > - int i; > - > - dprintk("--> %s\n", __func__); > - for (i = 0; i < resused; i++) { > - if (!respages[i]) > - continue; > - put_page(respages[i]); > - respages[i] = NULL; > - } > -} > - > -static void > -nfsd4_copy_pages(struct page **topages, struct page **frompages, short count) > -{ > - int i; > - > - for (i = 0; i < count; i++) { > - topages[i] = frompages[i]; > - if (!topages[i]) > - continue; > - get_page(topages[i]); > - } > -} > - > /* > - * Cache the reply pages up to NFSD_PAGES_PER_SLOT + 1, clearing the previous > - * pages. We add a page to NFSD_PAGES_PER_SLOT for the case where the total > - * length of the XDR response is less than se_fmaxresp_cached > - * (NFSD_PAGES_PER_SLOT * PAGE_SIZE) but the xdr_buf pages is used for a > - * of the reply (e.g. readdir). > - * > - * Store the base and length of the rq_req.head[0] page > - * of the NFSv4.1 data, just past the rpc header. > + * Cache a reply. nfsd4_check_drc_limit() has bounded the cache size. > */ > void > nfsd4_store_cache_entry(struct nfsd4_compoundres *resp) > { > - struct nfsd4_cache_entry *entry = &resp->cstate.slot->sl_cache_entry; > - struct svc_rqst *rqstp = resp->rqstp; > - struct kvec *resv = &rqstp->rq_res.head[0]; > - > - dprintk("--> %s entry %p\n", __func__, entry); > + struct nfsd4_slot *slot = resp->cstate.slot; > + unsigned int base; > > - nfsd4_release_respages(entry->ce_respages, entry->ce_resused); > - entry->ce_opcnt = resp->opcnt; > - entry->ce_status = resp->cstate.status; > + dprintk("--> %s slot %p\n", __func__, slot); > > - /* > - * Don't need a page to cache just the sequence operation - the slot > - * does this for us! > - */ > + slot->sl_opcnt = resp->opcnt; > + slot->sl_status = resp->cstate.status; > > if (nfsd4_not_cached(resp)) { > - entry->ce_resused = 0; > - entry->ce_rpchdrlen = 0; > - dprintk("%s Just cache SEQUENCE. ce_cachethis %d\n", __func__, > - resp->cstate.slot->sl_cache_entry.ce_cachethis); > + slot->sl_datalen = 0; > return; > } > - entry->ce_resused = rqstp->rq_resused; > - if (entry->ce_resused > NFSD_PAGES_PER_SLOT + 1) > - entry->ce_resused = NFSD_PAGES_PER_SLOT + 1; > - nfsd4_copy_pages(entry->ce_respages, rqstp->rq_respages, > - entry->ce_resused); > - entry->ce_datav.iov_base = resp->cstate.statp; > - entry->ce_datav.iov_len = resv->iov_len - ((char *)resp->cstate.statp - > - (char *)page_address(rqstp->rq_respages[0])); > - /* Current request rpc header length*/ > - entry->ce_rpchdrlen = (char *)resp->cstate.statp - > - (char *)page_address(rqstp->rq_respages[0]); > -} > - > -/* > - * We keep the rpc header, but take the nfs reply from the replycache. > - */ > -static int > -nfsd41_copy_replay_data(struct nfsd4_compoundres *resp, > - struct nfsd4_cache_entry *entry) > -{ > - struct svc_rqst *rqstp = resp->rqstp; > - struct kvec *resv = &resp->rqstp->rq_res.head[0]; > - int len; > - > - /* Current request rpc header length*/ > - len = (char *)resp->cstate.statp - > - (char *)page_address(rqstp->rq_respages[0]); > - if (entry->ce_datav.iov_len + len > PAGE_SIZE) { > - dprintk("%s v41 cached reply too large (%Zd).\n", __func__, > - entry->ce_datav.iov_len); > - return 0; > - } > - /* copy the cached reply nfsd data past the current rpc header */ > - memcpy((char *)resv->iov_base + len, entry->ce_datav.iov_base, > - entry->ce_datav.iov_len); > - resv->iov_len = len + entry->ce_datav.iov_len; > - return 1; > + slot->sl_datalen = (char *)resp->p - (char *)resp->cstate.datap; > + base = (char *)resp->cstate.datap - > + (char *)resp->xbuf->head[0].iov_base; > + if (read_bytes_from_xdr_buf(resp->xbuf, base, slot->sl_data, > + slot->sl_datalen)) > + printk(KERN_WARNING > + "nfsd: sessions DRC could not cache compound\n"); I'd make this WARN("nfsd:...") just to make it completely clear it's a kernel bug. (This case should be caught by nfsd4_check_drc_limit unless we've messed something up, right?) > + return; > } > > /* > @@ -1091,14 +1024,14 @@ nfsd4_enc_sequence_replay(struct nfsd4_compoundargs *args, > struct nfsd4_slot *slot = resp->cstate.slot; > > dprintk("--> %s resp->opcnt %d cachethis %u \n", __func__, > - resp->opcnt, resp->cstate.slot->sl_cache_entry.ce_cachethis); > + resp->opcnt, resp->cstate.slot->sl_cachethis); > > /* Encode the replayed sequence operation */ > op = &args->ops[resp->opcnt - 1]; > nfsd4_encode_operation(resp, op); > > /* Return nfserr_retry_uncached_rep in next operation. */ > - if (args->opcnt > 1 && slot->sl_cache_entry.ce_cachethis == 0) { > + if (args->opcnt > 1 && slot->sl_cachethis == 0) { > op = &args->ops[resp->opcnt++]; > op->status = nfserr_retry_uncached_rep; > nfsd4_encode_operation(resp, op); > @@ -1107,57 +1040,29 @@ nfsd4_enc_sequence_replay(struct nfsd4_compoundargs *args, > } > > /* > - * Keep the first page of the replay. Copy the NFSv4.1 data from the first > - * cached page. Replace any futher replay pages from the cache. > + * The sequence operation is not cached because we can use the slot and > + * session values. > */ > __be32 > nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp, > struct nfsd4_sequence *seq) > { > - struct nfsd4_cache_entry *entry = &resp->cstate.slot->sl_cache_entry; > + struct nfsd4_slot *slot = resp->cstate.slot; > __be32 status; > > - dprintk("--> %s entry %p\n", __func__, entry); > - > - /* > - * If this is just the sequence operation, we did not keep > - * a page in the cache entry because we can just use the > - * slot info stored in struct nfsd4_sequence that was checked > - * against the slot in nfsd4_sequence(). > - * > - * This occurs when seq->cachethis is FALSE, or when the client > - * session inactivity timer fires and a solo sequence operation > - * is sent (lease renewal). > - */ > + dprintk("--> %s slot %p\n", __func__, slot); > > /* Either returns 0 or nfserr_retry_uncached */ > status = nfsd4_enc_sequence_replay(resp->rqstp->rq_argp, resp); > if (status == nfserr_retry_uncached_rep) > return status; > > - if (!nfsd41_copy_replay_data(resp, entry)) { > - /* > - * Not enough room to use the replay rpc header, send the > - * cached header. Release all the allocated result pages. > - */ > - svc_free_res_pages(resp->rqstp); > - nfsd4_copy_pages(resp->rqstp->rq_respages, entry->ce_respages, > - entry->ce_resused); > - } else { > - /* Release all but the first allocated result page */ > - > - resp->rqstp->rq_resused--; > - svc_free_res_pages(resp->rqstp); > - > - nfsd4_copy_pages(&resp->rqstp->rq_respages[1], > - &entry->ce_respages[1], > - entry->ce_resused - 1); > - } > + /* The sequence operation has been encoded, cstate->datap set. */ > + memcpy(resp->cstate.datap, slot->sl_data, slot->sl_datalen); > > - resp->rqstp->rq_resused = entry->ce_resused; > - resp->opcnt = entry->ce_opcnt; > - resp->cstate.iovlen = entry->ce_datav.iov_len + entry->ce_rpchdrlen; > - status = entry->ce_status; > + resp->opcnt = slot->sl_opcnt; > + resp->p = resp->cstate.datap + XDR_QUADLEN(slot->sl_datalen); > + status = slot->sl_status; > > return status; > } > @@ -1489,7 +1394,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, > if (seq->slotid >= session->se_fchannel.maxreqs) > goto out; > > - slot = &session->se_slots[seq->slotid]; > + slot = session->se_slots[seq->slotid]; > dprintk("%s: slotid %d\n", __func__, seq->slotid); > > /* We do not negotiate the number of slots yet, so set the > @@ -1502,7 +1407,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, > cstate->slot = slot; > cstate->session = session; > /* Return the cached reply status and set cstate->status > - * for nfsd4_svc_encode_compoundres processing */ > + * for nfsd4_proc_compound processing */ > status = nfsd4_replay_cache_entry(resp, seq); > cstate->status = nfserr_replay_cache; > goto replay_cache; > @@ -1513,7 +1418,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, > /* Success! bump slot seqid */ > slot->sl_inuse = true; > slot->sl_seqid = seq->seqid; > - slot->sl_cache_entry.ce_cachethis = seq->cachethis; > + slot->sl_cachethis = seq->cachethis; > > cstate->slot = slot; > cstate->session = session; > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > index fdf632b..49824ea 100644 > --- a/fs/nfsd/nfs4xdr.c > +++ b/fs/nfsd/nfs4xdr.c > @@ -3064,6 +3064,7 @@ nfsd4_encode_sequence(struct nfsd4_compoundres *resp, int nfserr, > WRITE32(0); > > ADJUST_ARGS(); > + resp->cstate.datap = p; /* DRC cache data pointer */ > return 0; > } > > @@ -3166,7 +3167,7 @@ static int nfsd4_check_drc_limit(struct nfsd4_compoundres *resp) > return status; > > session = resp->cstate.session; > - if (session == NULL || slot->sl_cache_entry.ce_cachethis == 0) > + if (session == NULL || slot->sl_cachethis == 0) > return status; > > if (resp->opcnt >= args->opcnt) > @@ -3291,6 +3292,7 @@ nfs4svc_encode_compoundres(struct svc_rqst *rqstp, __be32 *p, struct nfsd4_compo > /* > * All that remains is to write the tag and operation count... > */ > + struct nfsd4_compound_state *cs = &resp->cstate; > struct kvec *iov; > p = resp->tagp; > *p++ = htonl(resp->taglen); > @@ -3304,14 +3306,11 @@ nfs4svc_encode_compoundres(struct svc_rqst *rqstp, __be32 *p, struct nfsd4_compo > iov = &rqstp->rq_res.head[0]; > iov->iov_len = ((char*)resp->p) - (char*)iov->iov_base; > BUG_ON(iov->iov_len > PAGE_SIZE); > - if (nfsd4_has_session(&resp->cstate)) { > - if (resp->cstate.status == nfserr_replay_cache && > - !nfsd4_not_cached(resp)) { > - iov->iov_len = resp->cstate.iovlen; > - } else { > + if (nfsd4_has_session(cs)) { > + if (cs->status != nfserr_replay_cache) { > nfsd4_store_cache_entry(resp); > dprintk("%s: SET SLOT STATE TO AVAILABLE\n", __func__); > - resp->cstate.slot->sl_inuse = 0; > + resp->cstate.slot->sl_inuse = false; > } > nfsd4_put_session(resp->cstate.session); > } > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c > index d68cd05..944ef01 100644 > --- a/fs/nfsd/nfssvc.c > +++ b/fs/nfsd/nfssvc.c > @@ -576,10 +576,6 @@ nfsd_dispatch(struct svc_rqst *rqstp, __be32 *statp) > + rqstp->rq_res.head[0].iov_len; > rqstp->rq_res.head[0].iov_len += sizeof(__be32); > > - /* NFSv4.1 DRC requires statp */ > - if (rqstp->rq_vers == 4) > - nfsd4_set_statp(rqstp, statp); > - > /* Now call the procedure handler, and encode NFS status. */ > nfserr = proc->pc_func(rqstp, rqstp->rq_argp, rqstp->rq_resp); > nfserr = map_new_errors(rqstp->rq_vers, nfserr); > diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h > index ff0b771..e745100 100644 > --- a/include/linux/nfsd/state.h > +++ b/include/linux/nfsd/state.h > @@ -94,30 +94,23 @@ struct nfs4_cb_conn { > > /* Maximum number of slots per session. 160 is useful for long haul TCP */ > #define NFSD_MAX_SLOTS_PER_SESSION 160 > -/* Maximum number of pages per slot cache entry */ > -#define NFSD_PAGES_PER_SLOT 1 > -#define NFSD_SLOT_CACHE_SIZE PAGE_SIZE > /* Maximum number of operations per session compound */ > #define NFSD_MAX_OPS_PER_COMPOUND 16 > +/* Maximum session per slot cache size */ > +#define NFSD_SLOT_CACHE_SIZE 1024 > /* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */ > #define NFSD_CACHE_SIZE_SLOTS_PER_SESSION 32 > #define NFSD_MAX_MEM_PER_SESSION \ > (NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE) > > -struct nfsd4_cache_entry { > - __be32 ce_status; > - struct kvec ce_datav; /* encoded NFSv4.1 data in rq_res.head[0] */ > - struct page *ce_respages[NFSD_PAGES_PER_SLOT + 1]; > - int ce_cachethis; > - short ce_resused; > - int ce_opcnt; > - int ce_rpchdrlen; > -}; > - > struct nfsd4_slot { > - bool sl_inuse; > - u32 sl_seqid; > - struct nfsd4_cache_entry sl_cache_entry; > + bool sl_inuse; > + u32 sl_seqid; > + int sl_cachethis; > + int sl_opcnt; > + __be32 sl_status; > + u32 sl_datalen; > + char sl_data[]; Could you just move sl_inuse to the end? It'll save a few bytes in the structure (because the compiler will probably stick 3 bytes after it to align sl_seqid.) --b. > }; > > struct nfsd4_channel_attrs { > @@ -159,7 +152,7 @@ struct nfsd4_session { > struct nfs4_sessionid se_sessionid; > struct nfsd4_channel_attrs se_fchannel; > struct nfsd4_channel_attrs se_bchannel; > - struct nfsd4_slot se_slots[]; /* forward channel slots */ > + struct nfsd4_slot *se_slots[]; /* forward channel slots */ > }; > > static inline void > diff --git a/include/linux/nfsd/xdr4.h b/include/linux/nfsd/xdr4.h > index 3f71660..73164c2 100644 > --- a/include/linux/nfsd/xdr4.h > +++ b/include/linux/nfsd/xdr4.h > @@ -51,7 +51,7 @@ struct nfsd4_compound_state { > /* For sessions DRC */ > struct nfsd4_session *session; > struct nfsd4_slot *slot; > - __be32 *statp; > + __be32 *datap; > size_t iovlen; > u32 minorversion; > u32 status; > @@ -472,8 +472,7 @@ static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp) > > static inline bool nfsd4_not_cached(struct nfsd4_compoundres *resp) > { > - return !resp->cstate.slot->sl_cache_entry.ce_cachethis || > - nfsd4_is_solo_sequence(resp); > + return !resp->cstate.slot->sl_cachethis || nfsd4_is_solo_sequence(resp); > } > > #define NFS4_SVC_XDRSIZE sizeof(struct nfsd4_compoundargs) > -- > 1.6.2.5 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [pnfs] [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC 2009-08-28 21:33 ` J. Bruce Fields @ 2009-08-30 23:10 ` Benny Halevy 2009-08-31 13:08 ` J. Bruce Fields 2009-09-01 13:48 ` William A. (Andy) Adamson 1 sibling, 1 reply; 23+ messages in thread From: Benny Halevy @ 2009-08-30 23:10 UTC (permalink / raw) To: J. Bruce Fields; +Cc: andros, linux-nfs, pnfs On Aug. 29, 2009, 0:33 +0300, "J. Bruce Fields" <bfields@fieldses.org> wrote: > I think this is OK for now, but maybe stick something like: > > BUILD_BUG_ON(NFSD_MAX_SLOTS_PER_SESSION * sizeof(struct nfsd4_slot) > + sizeof(struct nfsd4_session) > PAGE_SIZE); > > in state.h just to warn anyone who wants to blindly bump up I think that the BUILD_BUG_ON should be placed in a function so it gets to compile, so it'd better be defined here and not in state.h (where it also is in the right context) Benny > NFSD_MAX_SLOTS_PER_SESSION. (It's not really forbidden to kmalloc more > than a page, but it's also not reliable, and if it becomes necessary > then we'd rather find some way to code around it.) > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [pnfs] [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC 2009-08-30 23:10 ` [pnfs] " Benny Halevy @ 2009-08-31 13:08 ` J. Bruce Fields 2009-08-31 13:43 ` William A. (Andy) Adamson 0 siblings, 1 reply; 23+ messages in thread From: J. Bruce Fields @ 2009-08-31 13:08 UTC (permalink / raw) To: Benny Halevy; +Cc: andros, linux-nfs, pnfs On Mon, Aug 31, 2009 at 02:10:55AM +0300, Benny Halevy wrote: > On Aug. 29, 2009, 0:33 +0300, "J. Bruce Fields" <bfields@fieldses.org> wrote: > > I think this is OK for now, but maybe stick something like: > > > > BUILD_BUG_ON(NFSD_MAX_SLOTS_PER_SESSION * sizeof(struct nfsd4_slot) > > + sizeof(struct nfsd4_session) > PAGE_SIZE); > > > > in state.h just to warn anyone who wants to blindly bump up > > I think that the BUILD_BUG_ON should be placed in a function so > it gets to compile, Whoops, thanks--I hadn't noticed that. > so it'd better be defined here and not in state.h > (where it also is in the right context) Sounds fine.--b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [pnfs] [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC 2009-08-31 13:08 ` J. Bruce Fields @ 2009-08-31 13:43 ` William A. (Andy) Adamson 0 siblings, 0 replies; 23+ messages in thread From: William A. (Andy) Adamson @ 2009-08-31 13:43 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Benny Halevy, linux-nfs, pnfs OK - Thanks for the review. New patches to follow.. -->Andy On Mon, Aug 31, 2009 at 9:08 AM, J. Bruce Fields<bfields@fieldses.org> = wrote: > On Mon, Aug 31, 2009 at 02:10:55AM +0300, Benny Halevy wrote: >> On Aug. 29, 2009, 0:33 +0300, "J. Bruce Fields" <bfields-uC3wQj2KruMpug/h7KTFAQ@public.gmane.org= g> wrote: >> > I think this is OK for now, but maybe stick something like: >> > >> > =A0 =A0 BUILD_BUG_ON(NFSD_MAX_SLOTS_PER_SESSION * sizeof(struct nf= sd4_slot) >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 + sizeof(struct nfsd4_sess= ion) > PAGE_SIZE); >> > >> > in state.h just to warn anyone who wants to blindly bump up >> >> I think that the BUILD_BUG_ON should be placed in a function so >> it gets to compile, > > Whoops, thanks--I hadn't noticed that. > >> so it'd better be defined here and not in state.h >> (where it also is in the right context) > > Sounds fine.--b. > _______________________________________________ > pNFS mailing list > pNFS@linux-nfs.org > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [pnfs] [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC 2009-08-28 21:33 ` J. Bruce Fields 2009-08-30 23:10 ` [pnfs] " Benny Halevy @ 2009-09-01 13:48 ` William A. (Andy) Adamson [not found] ` <89c397150909010648v4a4f5db8t87e09717e5a2c950-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 23+ messages in thread From: William A. (Andy) Adamson @ 2009-09-01 13:48 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs, pnfs On Fri, Aug 28, 2009 at 5:33 PM, J. Bruce Fields<bfields@fieldses.org> = wrote: > On Thu, Aug 27, 2009 at 12:07:44PM -0400, andros@netapp.com wrote: >> From: Andy Adamson <andros@netapp.com> >> >> Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of ho= lding nfsd >> pages in cache. >> >> Connectathon testing has shown that 1024 bytes for encoded compound = operation >> responses past the sequence operation is sufficient, 512 bytes is a = little too >> small. Set NFSD_SLOT_CACHE_SIZE to 1024. >> >> Allocate memory for the session DRC in the CREATE_SESSION operation >> to guarantee that the memory resource is available for caching respo= nses. >> Allocate each slot individually in preparation for slot table size n= egotiation. >> >> Remove struct nfsd4_cache_entry and helper functions for the old pag= e-based >> DRC. >> >> The iov_len calculation in nfs4svc_encode_compoundres is now always >> correct, clean up the nfs4svc_encode_compoundres session logic. >> >> The nfsd4_compound_state statp pointer is also not used. >> Remove nfsd4_set_statp(). >> >> Move useful nfsd4_cache_entry fields into nfsd4_slot. >> >> Signed-off-by: Andy Adamson <andros@netapp.com >> --- >> =A0fs/nfsd/nfs4state.c =A0 =A0 =A0 =A0| =A0207 ++++++++++++---------= ----------------------- >> =A0fs/nfsd/nfs4xdr.c =A0 =A0 =A0 =A0 =A0| =A0 13 ++-- >> =A0fs/nfsd/nfssvc.c =A0 =A0 =A0 =A0 =A0 | =A0 =A04 - >> =A0include/linux/nfsd/state.h | =A0 27 ++---- >> =A0include/linux/nfsd/xdr4.h =A0| =A0 =A05 +- >> =A05 files changed, 74 insertions(+), 182 deletions(-) >> >> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >> index 4695cec..2d72d5c 100644 >> --- a/fs/nfsd/nfs4state.c >> +++ b/fs/nfsd/nfs4state.c >> @@ -510,12 +510,22 @@ static int init_forechannel_attrs(struct svc_r= qst *rqstp, >> =A0 =A0 =A0 return status; >> =A0} >> >> +static void >> +free_session_slots(struct nfsd4_session *ses) >> +{ >> + =A0 =A0 int i; >> + >> + =A0 =A0 for (i =3D 0; i < ses->se_fchannel.maxreqs; i++) >> + =A0 =A0 =A0 =A0 =A0 =A0 kfree(ses->se_slots[i]); >> +} >> + >> =A0static int >> =A0alloc_init_session(struct svc_rqst *rqstp, struct nfs4_client *cl= p, >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct nfsd4_create_session *cses= ) >> =A0{ >> =A0 =A0 =A0 struct nfsd4_session *new, tmp; >> - =A0 =A0 int idx, status =3D nfserr_serverfault, slotsize; >> + =A0 =A0 struct nfsd4_slot *sp; >> + =A0 =A0 int idx, status =3D nfserr_serverfault, slotsize, cachesiz= e, i; > > Just as a style thing: that list's getting a little long. =A0Could yo= u > keep at least "status" on a separate line? > >> >> =A0 =A0 =A0 memset(&tmp, 0, sizeof(tmp)); >> >> @@ -526,14 +536,23 @@ alloc_init_session(struct svc_rqst *rqstp, str= uct nfs4_client *clp, >> =A0 =A0 =A0 if (status) >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out; >> >> - =A0 =A0 /* allocate struct nfsd4_session and slot table in one pie= ce */ >> - =A0 =A0 slotsize =3D tmp.se_fchannel.maxreqs * sizeof(struct nfsd4= _slot); >> + =A0 =A0 /* allocate struct nfsd4_session and slot table pointers i= n one piece */ >> + =A0 =A0 slotsize =3D tmp.se_fchannel.maxreqs * sizeof(struct nfsd4= _slot *); >> =A0 =A0 =A0 new =3D kzalloc(sizeof(*new) + slotsize, GFP_KERNEL); > > I think this is OK for now, but maybe stick something like: > > =A0 =A0 =A0 =A0BUILD_BUG_ON(NFSD_MAX_SLOTS_PER_SESSION * sizeof(struc= t nfsd4_slot) > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0+ sizeof(struct nfsd4_= session) > PAGE_SIZE); > > in state.h just to warn anyone who wants to blindly bump up > NFSD_MAX_SLOTS_PER_SESSION. =A0(It's not really forbidden to kmalloc = more > than a page, but it's also not reliable, and if it becomes necessary > then we'd rather find some way to code around it.) > >> =A0 =A0 =A0 if (!new) >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out; >> >> =A0 =A0 =A0 memcpy(new, &tmp, sizeof(*new)); >> >> + =A0 =A0 /* allocate each struct nfsd4_slot and data cache in one p= iece */ >> + =A0 =A0 cachesize =3D new->se_fchannel.maxresp_cached - NFSD_MIN_H= DR_SEQ_SZ; >> + =A0 =A0 for (i =3D 0; i < new->se_fchannel.maxreqs; i++) { >> + =A0 =A0 =A0 =A0 =A0 =A0 sp =3D kzalloc(sizeof(*sp) + cachesize, GF= P_KERNEL); >> + =A0 =A0 =A0 =A0 =A0 =A0 if (!sp) >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out_free; >> + =A0 =A0 =A0 =A0 =A0 =A0 new->se_slots[i] =3D sp; >> + =A0 =A0 } >> + >> =A0 =A0 =A0 new->se_client =3D clp; >> =A0 =A0 =A0 gen_sessionid(new); >> =A0 =A0 =A0 idx =3D hash_sessionid(&new->se_sessionid); >> @@ -550,6 +569,10 @@ alloc_init_session(struct svc_rqst *rqstp, stru= ct nfs4_client *clp, >> =A0 =A0 =A0 status =3D nfs_ok; >> =A0out: >> =A0 =A0 =A0 return status; >> +out_free: >> + =A0 =A0 free_session_slots(new); >> + =A0 =A0 kfree(new); >> + =A0 =A0 goto out; >> =A0} >> >> =A0/* caller must hold sessionid_lock */ >> @@ -592,22 +615,16 @@ release_session(struct nfsd4_session *ses) >> =A0 =A0 =A0 nfsd4_put_session(ses); >> =A0} >> >> -static void nfsd4_release_respages(struct page **respages, short re= sused); >> - >> =A0void >> =A0free_session(struct kref *kref) >> =A0{ >> =A0 =A0 =A0 struct nfsd4_session *ses; >> - =A0 =A0 int i; >> >> =A0 =A0 =A0 ses =3D container_of(kref, struct nfsd4_session, se_ref)= ; >> - =A0 =A0 for (i =3D 0; i < ses->se_fchannel.maxreqs; i++) { >> - =A0 =A0 =A0 =A0 =A0 =A0 struct nfsd4_cache_entry *e =3D &ses->se_s= lots[i].sl_cache_entry; >> - =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_release_respages(e->ce_respages, e->= ce_resused); >> - =A0 =A0 } >> =A0 =A0 =A0 spin_lock(&nfsd_drc_lock); >> =A0 =A0 =A0 nfsd_drc_mem_used -=3D ses->se_fchannel.maxreqs * NFSD_S= LOT_CACHE_SIZE; >> =A0 =A0 =A0 spin_unlock(&nfsd_drc_lock); >> + =A0 =A0 free_session_slots(ses); >> =A0 =A0 =A0 kfree(ses); >> =A0} >> >> @@ -964,116 +981,32 @@ out_err: >> =A0 =A0 =A0 return; >> =A0} >> >> -void >> -nfsd4_set_statp(struct svc_rqst *rqstp, __be32 *statp) >> -{ >> - =A0 =A0 struct nfsd4_compoundres *resp =3D rqstp->rq_resp; >> - >> - =A0 =A0 resp->cstate.statp =3D statp; >> -} >> - >> -/* >> - * Dereference the result pages. >> - */ >> -static void >> -nfsd4_release_respages(struct page **respages, short resused) >> -{ >> - =A0 =A0 int i; >> - >> - =A0 =A0 dprintk("--> %s\n", __func__); >> - =A0 =A0 for (i =3D 0; i < resused; i++) { >> - =A0 =A0 =A0 =A0 =A0 =A0 if (!respages[i]) >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 continue; >> - =A0 =A0 =A0 =A0 =A0 =A0 put_page(respages[i]); >> - =A0 =A0 =A0 =A0 =A0 =A0 respages[i] =3D NULL; >> - =A0 =A0 } >> -} >> - >> -static void >> -nfsd4_copy_pages(struct page **topages, struct page **frompages, sh= ort count) >> -{ >> - =A0 =A0 int i; >> - >> - =A0 =A0 for (i =3D 0; i < count; i++) { >> - =A0 =A0 =A0 =A0 =A0 =A0 topages[i] =3D frompages[i]; >> - =A0 =A0 =A0 =A0 =A0 =A0 if (!topages[i]) >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 continue; >> - =A0 =A0 =A0 =A0 =A0 =A0 get_page(topages[i]); >> - =A0 =A0 } >> -} >> - >> =A0/* >> - * Cache the reply pages up to NFSD_PAGES_PER_SLOT + 1, clearing th= e previous >> - * pages. We add a page to NFSD_PAGES_PER_SLOT for the case where t= he total >> - * length of the XDR response is less than se_fmaxresp_cached >> - * (NFSD_PAGES_PER_SLOT * PAGE_SIZE) but the xdr_buf pages is used = for a >> - * of the reply (e.g. readdir). >> - * >> - * Store the base and length of the rq_req.head[0] page >> - * of the NFSv4.1 data, just past the rpc header. >> + * Cache a reply. nfsd4_check_drc_limit() has bounded the cache siz= e. >> =A0 */ >> =A0void >> =A0nfsd4_store_cache_entry(struct nfsd4_compoundres *resp) >> =A0{ >> - =A0 =A0 struct nfsd4_cache_entry *entry =3D &resp->cstate.slot->sl= _cache_entry; >> - =A0 =A0 struct svc_rqst *rqstp =3D resp->rqstp; >> - =A0 =A0 struct kvec *resv =3D &rqstp->rq_res.head[0]; >> - >> - =A0 =A0 dprintk("--> %s entry %p\n", __func__, entry); >> + =A0 =A0 struct nfsd4_slot *slot =3D resp->cstate.slot; >> + =A0 =A0 unsigned int base; >> >> - =A0 =A0 nfsd4_release_respages(entry->ce_respages, entry->ce_resus= ed); >> - =A0 =A0 entry->ce_opcnt =3D resp->opcnt; >> - =A0 =A0 entry->ce_status =3D resp->cstate.status; >> + =A0 =A0 dprintk("--> %s slot %p\n", __func__, slot); >> >> - =A0 =A0 /* >> - =A0 =A0 =A0* Don't need a page to cache just the sequence operatio= n - the slot >> - =A0 =A0 =A0* does this for us! >> - =A0 =A0 =A0*/ >> + =A0 =A0 slot->sl_opcnt =3D resp->opcnt; >> + =A0 =A0 slot->sl_status =3D resp->cstate.status; >> >> =A0 =A0 =A0 if (nfsd4_not_cached(resp)) { >> - =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_resused =3D 0; >> - =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_rpchdrlen =3D 0; >> - =A0 =A0 =A0 =A0 =A0 =A0 dprintk("%s Just cache SEQUENCE. ce_cachet= his %d\n", __func__, >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 resp->cstate.slot->sl_cach= e_entry.ce_cachethis); >> + =A0 =A0 =A0 =A0 =A0 =A0 slot->sl_datalen =3D 0; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 return; >> =A0 =A0 =A0 } >> - =A0 =A0 entry->ce_resused =3D rqstp->rq_resused; >> - =A0 =A0 if (entry->ce_resused > NFSD_PAGES_PER_SLOT + 1) >> - =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_resused =3D NFSD_PAGES_PER_SLOT = + 1; >> - =A0 =A0 nfsd4_copy_pages(entry->ce_respages, rqstp->rq_respages, >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0entry->ce_resused); >> - =A0 =A0 entry->ce_datav.iov_base =3D resp->cstate.statp; >> - =A0 =A0 entry->ce_datav.iov_len =3D resv->iov_len - ((char *)resp-= >cstate.statp - >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 (char *)pa= ge_address(rqstp->rq_respages[0])); >> - =A0 =A0 /* Current request rpc header length*/ >> - =A0 =A0 entry->ce_rpchdrlen =3D (char *)resp->cstate.statp - >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 (char *)pa= ge_address(rqstp->rq_respages[0]); >> -} >> - >> -/* >> - * We keep the rpc header, but take the nfs reply from the replycac= he. >> - */ >> -static int >> -nfsd41_copy_replay_data(struct nfsd4_compoundres *resp, >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 struct nfsd4_cache_entry *= entry) >> -{ >> - =A0 =A0 struct svc_rqst *rqstp =3D resp->rqstp; >> - =A0 =A0 struct kvec *resv =3D &resp->rqstp->rq_res.head[0]; >> - =A0 =A0 int len; >> - >> - =A0 =A0 /* Current request rpc header length*/ >> - =A0 =A0 len =3D (char *)resp->cstate.statp - >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 (char *)page_address(rqstp= ->rq_respages[0]); >> - =A0 =A0 if (entry->ce_datav.iov_len + len > PAGE_SIZE) { >> - =A0 =A0 =A0 =A0 =A0 =A0 dprintk("%s v41 cached reply too large (%Z= d).\n", __func__, >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_datav.iov_len); >> - =A0 =A0 =A0 =A0 =A0 =A0 return 0; >> - =A0 =A0 } >> - =A0 =A0 /* copy the cached reply nfsd data past the current rpc he= ader */ >> - =A0 =A0 memcpy((char *)resv->iov_base + len, entry->ce_datav.iov_b= ase, >> - =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_datav.iov_len); >> - =A0 =A0 resv->iov_len =3D len + entry->ce_datav.iov_len; >> - =A0 =A0 return 1; >> + =A0 =A0 slot->sl_datalen =3D (char *)resp->p - (char *)resp->cstat= e.datap; >> + =A0 =A0 base =3D (char *)resp->cstate.datap - >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 (char *)resp->xbuf->head[0].iov_base; >> + =A0 =A0 if (read_bytes_from_xdr_buf(resp->xbuf, base, slot->sl_dat= a, >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 sl= ot->sl_datalen)) >> + =A0 =A0 =A0 =A0 =A0 =A0 printk(KERN_WARNING >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 "nfsd: sessions DRC could = not cache compound\n"); > > I'd make this WARN("nfsd:...") just to make it completely clear it's = a > kernel bug. =A0(This case should be caught by nfsd4_check_drc_limit u= nless > we've messed something up, right?) > >> + =A0 =A0 return; >> =A0} >> >> =A0/* >> @@ -1091,14 +1024,14 @@ nfsd4_enc_sequence_replay(struct nfsd4_compo= undargs *args, >> =A0 =A0 =A0 struct nfsd4_slot *slot =3D resp->cstate.slot; >> >> =A0 =A0 =A0 dprintk("--> %s resp->opcnt %d cachethis %u \n", __func_= _, >> - =A0 =A0 =A0 =A0 =A0 =A0 resp->opcnt, resp->cstate.slot->sl_cache_e= ntry.ce_cachethis); >> + =A0 =A0 =A0 =A0 =A0 =A0 resp->opcnt, resp->cstate.slot->sl_cacheth= is); >> >> =A0 =A0 =A0 /* Encode the replayed sequence operation */ >> =A0 =A0 =A0 op =3D &args->ops[resp->opcnt - 1]; >> =A0 =A0 =A0 nfsd4_encode_operation(resp, op); >> >> =A0 =A0 =A0 /* Return nfserr_retry_uncached_rep in next operation. *= / >> - =A0 =A0 if (args->opcnt > 1 && slot->sl_cache_entry.ce_cachethis =3D= =3D 0) { >> + =A0 =A0 if (args->opcnt > 1 && slot->sl_cachethis =3D=3D 0) { >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 op =3D &args->ops[resp->opcnt++]; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 op->status =3D nfserr_retry_uncached_rep= ; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_encode_operation(resp, op); >> @@ -1107,57 +1040,29 @@ nfsd4_enc_sequence_replay(struct nfsd4_compo= undargs *args, >> =A0} >> >> =A0/* >> - * Keep the first page of the replay. Copy the NFSv4.1 data from th= e first >> - * cached page. =A0Replace any futher replay pages from the cache. >> + * The sequence operation is not cached because we can use the slot= and >> + * session values. >> =A0 */ >> =A0__be32 >> =A0nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp, >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct nfsd4_sequence= *seq) >> =A0{ >> - =A0 =A0 struct nfsd4_cache_entry *entry =3D &resp->cstate.slot->sl= _cache_entry; >> + =A0 =A0 struct nfsd4_slot *slot =3D resp->cstate.slot; >> =A0 =A0 =A0 __be32 status; >> >> - =A0 =A0 dprintk("--> %s entry %p\n", __func__, entry); >> - >> - =A0 =A0 /* >> - =A0 =A0 =A0* If this is just the sequence operation, we did not ke= ep >> - =A0 =A0 =A0* a page in the cache entry because we can just use the >> - =A0 =A0 =A0* slot info stored in struct nfsd4_sequence that was ch= ecked >> - =A0 =A0 =A0* against the slot in nfsd4_sequence(). >> - =A0 =A0 =A0* >> - =A0 =A0 =A0* This occurs when seq->cachethis is FALSE, or when the= client >> - =A0 =A0 =A0* session inactivity timer fires and a solo sequence op= eration >> - =A0 =A0 =A0* is sent (lease renewal). >> - =A0 =A0 =A0*/ >> + =A0 =A0 dprintk("--> %s slot %p\n", __func__, slot); >> >> =A0 =A0 =A0 /* Either returns 0 or nfserr_retry_uncached */ >> =A0 =A0 =A0 status =3D nfsd4_enc_sequence_replay(resp->rqstp->rq_arg= p, resp); >> =A0 =A0 =A0 if (status =3D=3D nfserr_retry_uncached_rep) >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 return status; >> >> - =A0 =A0 if (!nfsd41_copy_replay_data(resp, entry)) { >> - =A0 =A0 =A0 =A0 =A0 =A0 /* >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0* Not enough room to use the replay rpc= header, send the >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0* cached header. Release all the alloca= ted result pages. >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0*/ >> - =A0 =A0 =A0 =A0 =A0 =A0 svc_free_res_pages(resp->rqstp); >> - =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_copy_pages(resp->rqstp->rq_respages,= entry->ce_respages, >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_resused); >> - =A0 =A0 } else { >> - =A0 =A0 =A0 =A0 =A0 =A0 /* Release all but the first allocated res= ult page */ >> - >> - =A0 =A0 =A0 =A0 =A0 =A0 resp->rqstp->rq_resused--; >> - =A0 =A0 =A0 =A0 =A0 =A0 svc_free_res_pages(resp->rqstp); >> - >> - =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_copy_pages(&resp->rqstp->rq_respages= [1], >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0&entry-= >ce_respages[1], >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0entry->= ce_resused - 1); >> - =A0 =A0 } >> + =A0 =A0 /* The sequence operation has been encoded, cstate->datap = set. */ >> + =A0 =A0 memcpy(resp->cstate.datap, slot->sl_data, slot->sl_datalen= ); >> >> - =A0 =A0 resp->rqstp->rq_resused =3D entry->ce_resused; >> - =A0 =A0 resp->opcnt =3D entry->ce_opcnt; >> - =A0 =A0 resp->cstate.iovlen =3D entry->ce_datav.iov_len + entry->c= e_rpchdrlen; >> - =A0 =A0 status =3D entry->ce_status; >> + =A0 =A0 resp->opcnt =3D slot->sl_opcnt; >> + =A0 =A0 resp->p =3D resp->cstate.datap + XDR_QUADLEN(slot->sl_data= len); >> + =A0 =A0 status =3D slot->sl_status; >> >> =A0 =A0 =A0 return status; >> =A0} >> @@ -1489,7 +1394,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, >> =A0 =A0 =A0 if (seq->slotid >=3D session->se_fchannel.maxreqs) >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out; >> >> - =A0 =A0 slot =3D &session->se_slots[seq->slotid]; >> + =A0 =A0 slot =3D session->se_slots[seq->slotid]; >> =A0 =A0 =A0 dprintk("%s: slotid %d\n", __func__, seq->slotid); >> >> =A0 =A0 =A0 /* We do not negotiate the number of slots yet, so set t= he >> @@ -1502,7 +1407,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 cstate->slot =3D slot; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 cstate->session =3D session; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 /* Return the cached reply status and se= t cstate->status >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0* for nfsd4_svc_encode_compoundres proc= essing */ >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0* for nfsd4_proc_compound processing */ >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 status =3D nfsd4_replay_cache_entry(resp= , seq); >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 cstate->status =3D nfserr_replay_cache; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto replay_cache; >> @@ -1513,7 +1418,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, >> =A0 =A0 =A0 /* Success! bump slot seqid */ >> =A0 =A0 =A0 slot->sl_inuse =3D true; >> =A0 =A0 =A0 slot->sl_seqid =3D seq->seqid; >> - =A0 =A0 slot->sl_cache_entry.ce_cachethis =3D seq->cachethis; >> + =A0 =A0 slot->sl_cachethis =3D seq->cachethis; >> >> =A0 =A0 =A0 cstate->slot =3D slot; >> =A0 =A0 =A0 cstate->session =3D session; >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c >> index fdf632b..49824ea 100644 >> --- a/fs/nfsd/nfs4xdr.c >> +++ b/fs/nfsd/nfs4xdr.c >> @@ -3064,6 +3064,7 @@ nfsd4_encode_sequence(struct nfsd4_compoundres= *resp, int nfserr, >> =A0 =A0 =A0 WRITE32(0); >> >> =A0 =A0 =A0 ADJUST_ARGS(); >> + =A0 =A0 resp->cstate.datap =3D p; /* DRC cache data pointer */ >> =A0 =A0 =A0 return 0; >> =A0} >> >> @@ -3166,7 +3167,7 @@ static int nfsd4_check_drc_limit(struct nfsd4_= compoundres *resp) >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 return status; >> >> =A0 =A0 =A0 session =3D resp->cstate.session; >> - =A0 =A0 if (session =3D=3D NULL || slot->sl_cache_entry.ce_cacheth= is =3D=3D 0) >> + =A0 =A0 if (session =3D=3D NULL || slot->sl_cachethis =3D=3D 0) >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 return status; >> >> =A0 =A0 =A0 if (resp->opcnt >=3D args->opcnt) >> @@ -3291,6 +3292,7 @@ nfs4svc_encode_compoundres(struct svc_rqst *rq= stp, __be32 *p, struct nfsd4_compo >> =A0 =A0 =A0 /* >> =A0 =A0 =A0 =A0* All that remains is to write the tag and operation = count... >> =A0 =A0 =A0 =A0*/ >> + =A0 =A0 struct nfsd4_compound_state *cs =3D &resp->cstate; >> =A0 =A0 =A0 struct kvec *iov; >> =A0 =A0 =A0 p =3D resp->tagp; >> =A0 =A0 =A0 *p++ =3D htonl(resp->taglen); >> @@ -3304,14 +3306,11 @@ nfs4svc_encode_compoundres(struct svc_rqst *= rqstp, __be32 *p, struct nfsd4_compo >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 iov =3D &rqstp->rq_res.head[0]; >> =A0 =A0 =A0 iov->iov_len =3D ((char*)resp->p) - (char*)iov->iov_base= ; >> =A0 =A0 =A0 BUG_ON(iov->iov_len > PAGE_SIZE); >> - =A0 =A0 if (nfsd4_has_session(&resp->cstate)) { >> - =A0 =A0 =A0 =A0 =A0 =A0 if (resp->cstate.status =3D=3D nfserr_repl= ay_cache && >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 !nfsd4_not= _cached(resp)) { >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 iov->iov_len =3D resp->cst= ate.iovlen; >> - =A0 =A0 =A0 =A0 =A0 =A0 } else { >> + =A0 =A0 if (nfsd4_has_session(cs)) { >> + =A0 =A0 =A0 =A0 =A0 =A0 if (cs->status !=3D nfserr_replay_cache) { >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_store_cache_entry(= resp); >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 dprintk("%s: SET SLOT ST= ATE TO AVAILABLE\n", __func__); >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 resp->cstate.slot->sl_inus= e =3D 0; >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 resp->cstate.slot->sl_inus= e =3D false; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 } >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_put_session(resp->cstate.session); >> =A0 =A0 =A0 } >> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c >> index d68cd05..944ef01 100644 >> --- a/fs/nfsd/nfssvc.c >> +++ b/fs/nfsd/nfssvc.c >> @@ -576,10 +576,6 @@ nfsd_dispatch(struct svc_rqst *rqstp, __be32 *s= tatp) >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 + rqstp->rq_res.head[0].iov_len; >> =A0 =A0 =A0 rqstp->rq_res.head[0].iov_len +=3D sizeof(__be32); >> >> - =A0 =A0 /* NFSv4.1 DRC requires statp */ >> - =A0 =A0 if (rqstp->rq_vers =3D=3D 4) >> - =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_set_statp(rqstp, statp); >> - >> =A0 =A0 =A0 /* Now call the procedure handler, and encode NFS status= =2E */ >> =A0 =A0 =A0 nfserr =3D proc->pc_func(rqstp, rqstp->rq_argp, rqstp->r= q_resp); >> =A0 =A0 =A0 nfserr =3D map_new_errors(rqstp->rq_vers, nfserr); >> diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h >> index ff0b771..e745100 100644 >> --- a/include/linux/nfsd/state.h >> +++ b/include/linux/nfsd/state.h >> @@ -94,30 +94,23 @@ struct nfs4_cb_conn { >> >> =A0/* Maximum number of slots per session. 160 is useful for long ha= ul TCP */ >> =A0#define NFSD_MAX_SLOTS_PER_SESSION =A0 =A0 160 >> -/* Maximum number of pages per slot cache entry */ >> -#define NFSD_PAGES_PER_SLOT =A01 >> -#define NFSD_SLOT_CACHE_SIZE =A0 =A0 =A0 =A0 PAGE_SIZE >> =A0/* Maximum number of operations per session compound */ >> =A0#define NFSD_MAX_OPS_PER_COMPOUND =A0 =A016 >> +/* Maximum =A0session per slot cache size */ >> +#define NFSD_SLOT_CACHE_SIZE =A0 =A0 =A0 =A0 1024 >> =A0/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */ >> =A0#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION =A0 =A032 >> =A0#define NFSD_MAX_MEM_PER_SESSION =A0\ >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 (NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFS= D_SLOT_CACHE_SIZE) >> >> -struct nfsd4_cache_entry { >> - =A0 =A0 __be32 =A0 =A0 =A0 =A0 =A0ce_status; >> - =A0 =A0 struct kvec =A0 =A0 ce_datav; /* encoded NFSv4.1 data in r= q_res.head[0] */ >> - =A0 =A0 struct page =A0 =A0 *ce_respages[NFSD_PAGES_PER_SLOT + 1]; >> - =A0 =A0 int =A0 =A0 =A0 =A0 =A0 =A0 ce_cachethis; >> - =A0 =A0 short =A0 =A0 =A0 =A0 =A0 ce_resused; >> - =A0 =A0 int =A0 =A0 =A0 =A0 =A0 =A0 ce_opcnt; >> - =A0 =A0 int =A0 =A0 =A0 =A0 =A0 =A0 ce_rpchdrlen; >> -}; >> - >> =A0struct nfsd4_slot { >> - =A0 =A0 bool =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= sl_inuse; >> - =A0 =A0 u32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= sl_seqid; >> - =A0 =A0 struct nfsd4_cache_entry =A0 =A0 =A0 =A0sl_cache_entry; >> + =A0 =A0 bool =A0 =A0sl_inuse; >> + =A0 =A0 u32 =A0 =A0 sl_seqid; >> + =A0 =A0 int =A0 =A0 sl_cachethis; >> + =A0 =A0 int =A0 =A0 sl_opcnt; >> + =A0 =A0 __be32 =A0sl_status; >> + =A0 =A0 u32 =A0 =A0 sl_datalen; >> + =A0 =A0 char =A0 =A0sl_data[]; > > Could you just move sl_inuse to the end? =A0It'll save a few bytes in= the > structure (because the compiler will probably stick 3 bytes after it = to > align sl_seqid.) How about this? struct nfsd4_slot { - bool sl_inuse; - u32 sl_seqid; - struct nfsd4_cache_entry sl_cache_entry; + bool sl_inuse; + bool sl_cachethis; + u16 sl_opcnt; + u32 sl_seqid; + __be32 sl_status; + u32 sl_datalen; + char sl_data[]; }; -->Andy > --b. > >> =A0}; >> >> =A0struct nfsd4_channel_attrs { >> @@ -159,7 +152,7 @@ struct nfsd4_session { >> =A0 =A0 =A0 struct nfs4_sessionid =A0 se_sessionid; >> =A0 =A0 =A0 struct nfsd4_channel_attrs se_fchannel; >> =A0 =A0 =A0 struct nfsd4_channel_attrs se_bchannel; >> - =A0 =A0 struct nfsd4_slot =A0 =A0 =A0 se_slots[]; =A0 =A0 /* forwa= rd channel slots */ >> + =A0 =A0 struct nfsd4_slot =A0 =A0 =A0 *se_slots[]; =A0 =A0/* forwa= rd channel slots */ >> =A0}; >> >> =A0static inline void >> diff --git a/include/linux/nfsd/xdr4.h b/include/linux/nfsd/xdr4.h >> index 3f71660..73164c2 100644 >> --- a/include/linux/nfsd/xdr4.h >> +++ b/include/linux/nfsd/xdr4.h >> @@ -51,7 +51,7 @@ struct nfsd4_compound_state { >> =A0 =A0 =A0 /* For sessions DRC */ >> =A0 =A0 =A0 struct nfsd4_session =A0 =A0*session; >> =A0 =A0 =A0 struct nfsd4_slot =A0 =A0 =A0 *slot; >> - =A0 =A0 __be32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0*statp; >> + =A0 =A0 __be32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0*datap; >> =A0 =A0 =A0 size_t =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0iovlen; >> =A0 =A0 =A0 u32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 minorversion= ; >> =A0 =A0 =A0 u32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 status; >> @@ -472,8 +472,7 @@ static inline bool nfsd4_is_solo_sequence(struct= nfsd4_compoundres *resp) >> >> =A0static inline bool nfsd4_not_cached(struct nfsd4_compoundres *res= p) >> =A0{ >> - =A0 =A0 return !resp->cstate.slot->sl_cache_entry.ce_cachethis || >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_is_solo_sequence(res= p); >> + =A0 =A0 return !resp->cstate.slot->sl_cachethis || nfsd4_is_solo_s= equence(resp); >> =A0} >> >> =A0#define NFS4_SVC_XDRSIZE =A0 =A0 =A0 =A0 =A0 =A0 sizeof(struct nf= sd4_compoundargs) >> -- >> 1.6.2.5 >> > _______________________________________________ > pNFS mailing list > pNFS@linux-nfs.org > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs > ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <89c397150909010648v4a4f5db8t87e09717e5a2c950-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [pnfs] [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC [not found] ` <89c397150909010648v4a4f5db8t87e09717e5a2c950-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2009-09-01 15:14 ` J. Bruce Fields 2009-09-01 15:22 ` William A. (Andy) Adamson 0 siblings, 1 reply; 23+ messages in thread From: J. Bruce Fields @ 2009-09-01 15:14 UTC (permalink / raw) To: William A. (Andy) Adamson; +Cc: linux-nfs, pnfs On Tue, Sep 01, 2009 at 09:48:41AM -0400, William A. (Andy) Adamson wro= te: > On Fri, Aug 28, 2009 at 5:33 PM, J. Bruce Fields<bfields@fieldses.org= > wrote: > > On Thu, Aug 27, 2009 at 12:07:44PM -0400, andros@netapp.com wrote: > >> From: Andy Adamson <andros@netapp.com> > >> > >> Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of = holding nfsd > >> pages in cache. > >> > >> Connectathon testing has shown that 1024 bytes for encoded compoun= d operation > >> responses past the sequence operation is sufficient, 512 bytes is = a little too > >> small. Set NFSD_SLOT_CACHE_SIZE to 1024. > >> > >> Allocate memory for the session DRC in the CREATE_SESSION operatio= n > >> to guarantee that the memory resource is available for caching res= ponses. > >> Allocate each slot individually in preparation for slot table size= negotiation. > >> > >> Remove struct nfsd4_cache_entry and helper functions for the old p= age-based > >> DRC. > >> > >> The iov_len calculation in nfs4svc_encode_compoundres is now alway= s > >> correct, clean up the nfs4svc_encode_compoundres session logic. > >> > >> The nfsd4_compound_state statp pointer is also not used. > >> Remove nfsd4_set_statp(). > >> > >> Move useful nfsd4_cache_entry fields into nfsd4_slot. > >> > >> Signed-off-by: Andy Adamson <andros@netapp.com > >> --- > >> =C2=A0fs/nfsd/nfs4state.c =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0207 += +++++++++++-------------------------------- > >> =C2=A0fs/nfsd/nfs4xdr.c =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0= 13 ++-- > >> =C2=A0fs/nfsd/nfssvc.c =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | =C2=A0= =C2=A04 - > >> =C2=A0include/linux/nfsd/state.h | =C2=A0 27 ++---- > >> =C2=A0include/linux/nfsd/xdr4.h =C2=A0| =C2=A0 =C2=A05 +- > >> =C2=A05 files changed, 74 insertions(+), 182 deletions(-) > >> > >> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > >> index 4695cec..2d72d5c 100644 > >> --- a/fs/nfsd/nfs4state.c > >> +++ b/fs/nfsd/nfs4state.c > >> @@ -510,12 +510,22 @@ static int init_forechannel_attrs(struct svc= _rqst *rqstp, > >> =C2=A0 =C2=A0 =C2=A0 return status; > >> =C2=A0} > >> > >> +static void > >> +free_session_slots(struct nfsd4_session *ses) > >> +{ > >> + =C2=A0 =C2=A0 int i; > >> + > >> + =C2=A0 =C2=A0 for (i =3D 0; i < ses->se_fchannel.maxreqs; i++) > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 kfree(ses->se_slots[i]= ); > >> +} > >> + > >> =C2=A0static int > >> =C2=A0alloc_init_session(struct svc_rqst *rqstp, struct nfs4_clien= t *clp, > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0stru= ct nfsd4_create_session *cses) > >> =C2=A0{ > >> =C2=A0 =C2=A0 =C2=A0 struct nfsd4_session *new, tmp; > >> - =C2=A0 =C2=A0 int idx, status =3D nfserr_serverfault, slotsize; > >> + =C2=A0 =C2=A0 struct nfsd4_slot *sp; > >> + =C2=A0 =C2=A0 int idx, status =3D nfserr_serverfault, slotsize, = cachesize, i; > > > > Just as a style thing: that list's getting a little long. =C2=A0Cou= ld you > > keep at least "status" on a separate line? > > > >> > >> =C2=A0 =C2=A0 =C2=A0 memset(&tmp, 0, sizeof(tmp)); > >> > >> @@ -526,14 +536,23 @@ alloc_init_session(struct svc_rqst *rqstp, s= truct nfs4_client *clp, > >> =C2=A0 =C2=A0 =C2=A0 if (status) > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 goto out; > >> > >> - =C2=A0 =C2=A0 /* allocate struct nfsd4_session and slot table in= one piece */ > >> - =C2=A0 =C2=A0 slotsize =3D tmp.se_fchannel.maxreqs * sizeof(stru= ct nfsd4_slot); > >> + =C2=A0 =C2=A0 /* allocate struct nfsd4_session and slot table po= inters in one piece */ > >> + =C2=A0 =C2=A0 slotsize =3D tmp.se_fchannel.maxreqs * sizeof(stru= ct nfsd4_slot *); > >> =C2=A0 =C2=A0 =C2=A0 new =3D kzalloc(sizeof(*new) + slotsize, GFP_= KERNEL); > > > > I think this is OK for now, but maybe stick something like: > > > > =C2=A0 =C2=A0 =C2=A0 =C2=A0BUILD_BUG_ON(NFSD_MAX_SLOTS_PER_SESSION = * sizeof(struct nfsd4_slot) > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0+ sizeof(struct nfsd4_session) > PAGE_SIZE); > > > > in state.h just to warn anyone who wants to blindly bump up > > NFSD_MAX_SLOTS_PER_SESSION. =C2=A0(It's not really forbidden to kma= lloc more > > than a page, but it's also not reliable, and if it becomes necessar= y > > then we'd rather find some way to code around it.) > > > >> =C2=A0 =C2=A0 =C2=A0 if (!new) > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 goto out; > >> > >> =C2=A0 =C2=A0 =C2=A0 memcpy(new, &tmp, sizeof(*new)); > >> > >> + =C2=A0 =C2=A0 /* allocate each struct nfsd4_slot and data cache = in one piece */ > >> + =C2=A0 =C2=A0 cachesize =3D new->se_fchannel.maxresp_cached - NF= SD_MIN_HDR_SEQ_SZ; > >> + =C2=A0 =C2=A0 for (i =3D 0; i < new->se_fchannel.maxreqs; i++) { > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 sp =3D kzalloc(sizeof(= *sp) + cachesize, GFP_KERNEL); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (!sp) > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 goto out_free; > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 new->se_slots[i] =3D s= p; > >> + =C2=A0 =C2=A0 } > >> + > >> =C2=A0 =C2=A0 =C2=A0 new->se_client =3D clp; > >> =C2=A0 =C2=A0 =C2=A0 gen_sessionid(new); > >> =C2=A0 =C2=A0 =C2=A0 idx =3D hash_sessionid(&new->se_sessionid); > >> @@ -550,6 +569,10 @@ alloc_init_session(struct svc_rqst *rqstp, st= ruct nfs4_client *clp, > >> =C2=A0 =C2=A0 =C2=A0 status =3D nfs_ok; > >> =C2=A0out: > >> =C2=A0 =C2=A0 =C2=A0 return status; > >> +out_free: > >> + =C2=A0 =C2=A0 free_session_slots(new); > >> + =C2=A0 =C2=A0 kfree(new); > >> + =C2=A0 =C2=A0 goto out; > >> =C2=A0} > >> > >> =C2=A0/* caller must hold sessionid_lock */ > >> @@ -592,22 +615,16 @@ release_session(struct nfsd4_session *ses) > >> =C2=A0 =C2=A0 =C2=A0 nfsd4_put_session(ses); > >> =C2=A0} > >> > >> -static void nfsd4_release_respages(struct page **respages, short = resused); > >> - > >> =C2=A0void > >> =C2=A0free_session(struct kref *kref) > >> =C2=A0{ > >> =C2=A0 =C2=A0 =C2=A0 struct nfsd4_session *ses; > >> - =C2=A0 =C2=A0 int i; > >> > >> =C2=A0 =C2=A0 =C2=A0 ses =3D container_of(kref, struct nfsd4_sessi= on, se_ref); > >> - =C2=A0 =C2=A0 for (i =3D 0; i < ses->se_fchannel.maxreqs; i++) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 struct nfsd4_cache_ent= ry *e =3D &ses->se_slots[i].sl_cache_entry; > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 nfsd4_release_respages= (e->ce_respages, e->ce_resused); > >> - =C2=A0 =C2=A0 } > >> =C2=A0 =C2=A0 =C2=A0 spin_lock(&nfsd_drc_lock); > >> =C2=A0 =C2=A0 =C2=A0 nfsd_drc_mem_used -=3D ses->se_fchannel.maxre= qs * NFSD_SLOT_CACHE_SIZE; > >> =C2=A0 =C2=A0 =C2=A0 spin_unlock(&nfsd_drc_lock); > >> + =C2=A0 =C2=A0 free_session_slots(ses); > >> =C2=A0 =C2=A0 =C2=A0 kfree(ses); > >> =C2=A0} > >> > >> @@ -964,116 +981,32 @@ out_err: > >> =C2=A0 =C2=A0 =C2=A0 return; > >> =C2=A0} > >> > >> -void > >> -nfsd4_set_statp(struct svc_rqst *rqstp, __be32 *statp) > >> -{ > >> - =C2=A0 =C2=A0 struct nfsd4_compoundres *resp =3D rqstp->rq_resp; > >> - > >> - =C2=A0 =C2=A0 resp->cstate.statp =3D statp; > >> -} > >> - > >> -/* > >> - * Dereference the result pages. > >> - */ > >> -static void > >> -nfsd4_release_respages(struct page **respages, short resused) > >> -{ > >> - =C2=A0 =C2=A0 int i; > >> - > >> - =C2=A0 =C2=A0 dprintk("--> %s\n", __func__); > >> - =C2=A0 =C2=A0 for (i =3D 0; i < resused; i++) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (!respages[i]) > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 continue; > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 put_page(respages[i]); > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 respages[i] =3D NULL; > >> - =C2=A0 =C2=A0 } > >> -} > >> - > >> -static void > >> -nfsd4_copy_pages(struct page **topages, struct page **frompages, = short count) > >> -{ > >> - =C2=A0 =C2=A0 int i; > >> - > >> - =C2=A0 =C2=A0 for (i =3D 0; i < count; i++) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 topages[i] =3D frompag= es[i]; > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (!topages[i]) > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 continue; > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 get_page(topages[i]); > >> - =C2=A0 =C2=A0 } > >> -} > >> - > >> =C2=A0/* > >> - * Cache the reply pages up to NFSD_PAGES_PER_SLOT + 1, clearing = the previous > >> - * pages. We add a page to NFSD_PAGES_PER_SLOT for the case where= the total > >> - * length of the XDR response is less than se_fmaxresp_cached > >> - * (NFSD_PAGES_PER_SLOT * PAGE_SIZE) but the xdr_buf pages is use= d for a > >> - * of the reply (e.g. readdir). > >> - * > >> - * Store the base and length of the rq_req.head[0] page > >> - * of the NFSv4.1 data, just past the rpc header. > >> + * Cache a reply. nfsd4_check_drc_limit() has bounded the cache s= ize. > >> =C2=A0 */ > >> =C2=A0void > >> =C2=A0nfsd4_store_cache_entry(struct nfsd4_compoundres *resp) > >> =C2=A0{ > >> - =C2=A0 =C2=A0 struct nfsd4_cache_entry *entry =3D &resp->cstate.= slot->sl_cache_entry; > >> - =C2=A0 =C2=A0 struct svc_rqst *rqstp =3D resp->rqstp; > >> - =C2=A0 =C2=A0 struct kvec *resv =3D &rqstp->rq_res.head[0]; > >> - > >> - =C2=A0 =C2=A0 dprintk("--> %s entry %p\n", __func__, entry); > >> + =C2=A0 =C2=A0 struct nfsd4_slot *slot =3D resp->cstate.slot; > >> + =C2=A0 =C2=A0 unsigned int base; > >> > >> - =C2=A0 =C2=A0 nfsd4_release_respages(entry->ce_respages, entry->= ce_resused); > >> - =C2=A0 =C2=A0 entry->ce_opcnt =3D resp->opcnt; > >> - =C2=A0 =C2=A0 entry->ce_status =3D resp->cstate.status; > >> + =C2=A0 =C2=A0 dprintk("--> %s slot %p\n", __func__, slot); > >> > >> - =C2=A0 =C2=A0 /* > >> - =C2=A0 =C2=A0 =C2=A0* Don't need a page to cache just the sequen= ce operation - the slot > >> - =C2=A0 =C2=A0 =C2=A0* does this for us! > >> - =C2=A0 =C2=A0 =C2=A0*/ > >> + =C2=A0 =C2=A0 slot->sl_opcnt =3D resp->opcnt; > >> + =C2=A0 =C2=A0 slot->sl_status =3D resp->cstate.status; > >> > >> =C2=A0 =C2=A0 =C2=A0 if (nfsd4_not_cached(resp)) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 entry->ce_resused =3D = 0; > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 entry->ce_rpchdrlen =3D= 0; > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 dprintk("%s Just cache= SEQUENCE. ce_cachethis %d\n", __func__, > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 resp->cstate.slot->sl_cache_entry.ce_cachethis); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 slot->sl_datalen =3D 0= ; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return; > >> =C2=A0 =C2=A0 =C2=A0 } > >> - =C2=A0 =C2=A0 entry->ce_resused =3D rqstp->rq_resused; > >> - =C2=A0 =C2=A0 if (entry->ce_resused > NFSD_PAGES_PER_SLOT + 1) > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 entry->ce_resused =3D = NFSD_PAGES_PER_SLOT + 1; > >> - =C2=A0 =C2=A0 nfsd4_copy_pages(entry->ce_respages, rqstp->rq_res= pages, > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0entry->ce_resused); > >> - =C2=A0 =C2=A0 entry->ce_datav.iov_base =3D resp->cstate.statp; > >> - =C2=A0 =C2=A0 entry->ce_datav.iov_len =3D resv->iov_len - ((char= *)resp->cstate.statp - > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 (char *)page_address(rqstp->rq_respages= [0])); > >> - =C2=A0 =C2=A0 /* Current request rpc header length*/ > >> - =C2=A0 =C2=A0 entry->ce_rpchdrlen =3D (char *)resp->cstate.statp= - > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 (char *)page_address(rqstp->rq_respages= [0]); > >> -} > >> - > >> -/* > >> - * We keep the rpc header, but take the nfs reply from the replyc= ache. > >> - */ > >> -static int > >> -nfsd41_copy_replay_data(struct nfsd4_compoundres *resp, > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 struct nfsd4_cache_entry *entry) > >> -{ > >> - =C2=A0 =C2=A0 struct svc_rqst *rqstp =3D resp->rqstp; > >> - =C2=A0 =C2=A0 struct kvec *resv =3D &resp->rqstp->rq_res.head[0]= ; > >> - =C2=A0 =C2=A0 int len; > >> - > >> - =C2=A0 =C2=A0 /* Current request rpc header length*/ > >> - =C2=A0 =C2=A0 len =3D (char *)resp->cstate.statp - > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 (char *)page_address(rqstp->rq_respages[0]); > >> - =C2=A0 =C2=A0 if (entry->ce_datav.iov_len + len > PAGE_SIZE) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 dprintk("%s v41 cached= reply too large (%Zd).\n", __func__, > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 entry->ce_datav.iov_len); > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return 0; > >> - =C2=A0 =C2=A0 } > >> - =C2=A0 =C2=A0 /* copy the cached reply nfsd data past the curren= t rpc header */ > >> - =C2=A0 =C2=A0 memcpy((char *)resv->iov_base + len, entry->ce_dat= av.iov_base, > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 entry->ce_datav.iov_le= n); > >> - =C2=A0 =C2=A0 resv->iov_len =3D len + entry->ce_datav.iov_len; > >> - =C2=A0 =C2=A0 return 1; > >> + =C2=A0 =C2=A0 slot->sl_datalen =3D (char *)resp->p - (char *)res= p->cstate.datap; > >> + =C2=A0 =C2=A0 base =3D (char *)resp->cstate.datap - > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 (char *)res= p->xbuf->head[0].iov_base; > >> + =C2=A0 =C2=A0 if (read_bytes_from_xdr_buf(resp->xbuf, base, slot= ->sl_data, > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 slot->sl_datalen)) > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 printk(KERN_WARNING > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 "nfsd: sessions DRC could not cache compound\n"); > > > > I'd make this WARN("nfsd:...") just to make it completely clear it'= s a > > kernel bug. =C2=A0(This case should be caught by nfsd4_check_drc_li= mit unless > > we've messed something up, right?) > > > >> + =C2=A0 =C2=A0 return; > >> =C2=A0} > >> > >> =C2=A0/* > >> @@ -1091,14 +1024,14 @@ nfsd4_enc_sequence_replay(struct nfsd4_com= poundargs *args, > >> =C2=A0 =C2=A0 =C2=A0 struct nfsd4_slot *slot =3D resp->cstate.slot= ; > >> > >> =C2=A0 =C2=A0 =C2=A0 dprintk("--> %s resp->opcnt %d cachethis %u \= n", __func__, > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 resp->opcnt, resp->cst= ate.slot->sl_cache_entry.ce_cachethis); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 resp->opcnt, resp->cst= ate.slot->sl_cachethis); > >> > >> =C2=A0 =C2=A0 =C2=A0 /* Encode the replayed sequence operation */ > >> =C2=A0 =C2=A0 =C2=A0 op =3D &args->ops[resp->opcnt - 1]; > >> =C2=A0 =C2=A0 =C2=A0 nfsd4_encode_operation(resp, op); > >> > >> =C2=A0 =C2=A0 =C2=A0 /* Return nfserr_retry_uncached_rep in next o= peration. */ > >> - =C2=A0 =C2=A0 if (args->opcnt > 1 && slot->sl_cache_entry.ce_cac= hethis =3D=3D 0) { > >> + =C2=A0 =C2=A0 if (args->opcnt > 1 && slot->sl_cachethis =3D=3D 0= ) { > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 op =3D &args->ops= [resp->opcnt++]; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 op->status =3D nf= serr_retry_uncached_rep; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 nfsd4_encode_oper= ation(resp, op); > >> @@ -1107,57 +1040,29 @@ nfsd4_enc_sequence_replay(struct nfsd4_com= poundargs *args, > >> =C2=A0} > >> > >> =C2=A0/* > >> - * Keep the first page of the replay. Copy the NFSv4.1 data from = the first > >> - * cached page. =C2=A0Replace any futher replay pages from the ca= che. > >> + * The sequence operation is not cached because we can use the sl= ot and > >> + * session values. > >> =C2=A0 */ > >> =C2=A0__be32 > >> =C2=A0nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp, > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0struct nfsd4_sequence *seq) > >> =C2=A0{ > >> - =C2=A0 =C2=A0 struct nfsd4_cache_entry *entry =3D &resp->cstate.= slot->sl_cache_entry; > >> + =C2=A0 =C2=A0 struct nfsd4_slot *slot =3D resp->cstate.slot; > >> =C2=A0 =C2=A0 =C2=A0 __be32 status; > >> > >> - =C2=A0 =C2=A0 dprintk("--> %s entry %p\n", __func__, entry); > >> - > >> - =C2=A0 =C2=A0 /* > >> - =C2=A0 =C2=A0 =C2=A0* If this is just the sequence operation, we= did not keep > >> - =C2=A0 =C2=A0 =C2=A0* a page in the cache entry because we can j= ust use the > >> - =C2=A0 =C2=A0 =C2=A0* slot info stored in struct nfsd4_sequence = that was checked > >> - =C2=A0 =C2=A0 =C2=A0* against the slot in nfsd4_sequence(). > >> - =C2=A0 =C2=A0 =C2=A0* > >> - =C2=A0 =C2=A0 =C2=A0* This occurs when seq->cachethis is FALSE, = or when the client > >> - =C2=A0 =C2=A0 =C2=A0* session inactivity timer fires and a solo = sequence operation > >> - =C2=A0 =C2=A0 =C2=A0* is sent (lease renewal). > >> - =C2=A0 =C2=A0 =C2=A0*/ > >> + =C2=A0 =C2=A0 dprintk("--> %s slot %p\n", __func__, slot); > >> > >> =C2=A0 =C2=A0 =C2=A0 /* Either returns 0 or nfserr_retry_uncached = */ > >> =C2=A0 =C2=A0 =C2=A0 status =3D nfsd4_enc_sequence_replay(resp->rq= stp->rq_argp, resp); > >> =C2=A0 =C2=A0 =C2=A0 if (status =3D=3D nfserr_retry_uncached_rep) > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return status; > >> > >> - =C2=A0 =C2=A0 if (!nfsd41_copy_replay_data(resp, entry)) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /* > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* Not enough roo= m to use the replay rpc header, send the > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* cached header.= Release all the allocated result pages. > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0*/ > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 svc_free_res_pages(res= p->rqstp); > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 nfsd4_copy_pages(resp-= >rqstp->rq_respages, entry->ce_respages, > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 entry->ce_resused); > >> - =C2=A0 =C2=A0 } else { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /* Release all but the= first allocated result page */ > >> - > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 resp->rqstp->rq_resuse= d--; > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 svc_free_res_pages(res= p->rqstp); > >> - > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 nfsd4_copy_pages(&resp= ->rqstp->rq_respages[1], > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0&entry->ce_respages[1], > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0entry->ce_resused - 1); > >> - =C2=A0 =C2=A0 } > >> + =C2=A0 =C2=A0 /* The sequence operation has been encoded, cstate= ->datap set. */ > >> + =C2=A0 =C2=A0 memcpy(resp->cstate.datap, slot->sl_data, slot->sl= _datalen); > >> > >> - =C2=A0 =C2=A0 resp->rqstp->rq_resused =3D entry->ce_resused; > >> - =C2=A0 =C2=A0 resp->opcnt =3D entry->ce_opcnt; > >> - =C2=A0 =C2=A0 resp->cstate.iovlen =3D entry->ce_datav.iov_len + = entry->ce_rpchdrlen; > >> - =C2=A0 =C2=A0 status =3D entry->ce_status; > >> + =C2=A0 =C2=A0 resp->opcnt =3D slot->sl_opcnt; > >> + =C2=A0 =C2=A0 resp->p =3D resp->cstate.datap + XDR_QUADLEN(slot-= >sl_datalen); > >> + =C2=A0 =C2=A0 status =3D slot->sl_status; > >> > >> =C2=A0 =C2=A0 =C2=A0 return status; > >> =C2=A0} > >> @@ -1489,7 +1394,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, > >> =C2=A0 =C2=A0 =C2=A0 if (seq->slotid >=3D session->se_fchannel.max= reqs) > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 goto out; > >> > >> - =C2=A0 =C2=A0 slot =3D &session->se_slots[seq->slotid]; > >> + =C2=A0 =C2=A0 slot =3D session->se_slots[seq->slotid]; > >> =C2=A0 =C2=A0 =C2=A0 dprintk("%s: slotid %d\n", __func__, seq->slo= tid); > >> > >> =C2=A0 =C2=A0 =C2=A0 /* We do not negotiate the number of slots ye= t, so set the > >> @@ -1502,7 +1407,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 cstate->slot =3D = slot; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 cstate->session =3D= session; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /* Return the cac= hed reply status and set cstate->status > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* for nfsd4_svc_= encode_compoundres processing */ > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* for nfsd4_proc= _compound processing */ > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 status =3D nfsd4_= replay_cache_entry(resp, seq); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 cstate->status =3D= nfserr_replay_cache; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 goto replay_cache= ; > >> @@ -1513,7 +1418,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, > >> =C2=A0 =C2=A0 =C2=A0 /* Success! bump slot seqid */ > >> =C2=A0 =C2=A0 =C2=A0 slot->sl_inuse =3D true; > >> =C2=A0 =C2=A0 =C2=A0 slot->sl_seqid =3D seq->seqid; > >> - =C2=A0 =C2=A0 slot->sl_cache_entry.ce_cachethis =3D seq->cacheth= is; > >> + =C2=A0 =C2=A0 slot->sl_cachethis =3D seq->cachethis; > >> > >> =C2=A0 =C2=A0 =C2=A0 cstate->slot =3D slot; > >> =C2=A0 =C2=A0 =C2=A0 cstate->session =3D session; > >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > >> index fdf632b..49824ea 100644 > >> --- a/fs/nfsd/nfs4xdr.c > >> +++ b/fs/nfsd/nfs4xdr.c > >> @@ -3064,6 +3064,7 @@ nfsd4_encode_sequence(struct nfsd4_compoundr= es *resp, int nfserr, > >> =C2=A0 =C2=A0 =C2=A0 WRITE32(0); > >> > >> =C2=A0 =C2=A0 =C2=A0 ADJUST_ARGS(); > >> + =C2=A0 =C2=A0 resp->cstate.datap =3D p; /* DRC cache data pointe= r */ > >> =C2=A0 =C2=A0 =C2=A0 return 0; > >> =C2=A0} > >> > >> @@ -3166,7 +3167,7 @@ static int nfsd4_check_drc_limit(struct nfsd= 4_compoundres *resp) > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return status; > >> > >> =C2=A0 =C2=A0 =C2=A0 session =3D resp->cstate.session; > >> - =C2=A0 =C2=A0 if (session =3D=3D NULL || slot->sl_cache_entry.ce= _cachethis =3D=3D 0) > >> + =C2=A0 =C2=A0 if (session =3D=3D NULL || slot->sl_cachethis =3D=3D= 0) > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return status; > >> > >> =C2=A0 =C2=A0 =C2=A0 if (resp->opcnt >=3D args->opcnt) > >> @@ -3291,6 +3292,7 @@ nfs4svc_encode_compoundres(struct svc_rqst *= rqstp, __be32 *p, struct nfsd4_compo > >> =C2=A0 =C2=A0 =C2=A0 /* > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0* All that remains is to write the tag = and operation count... > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0*/ > >> + =C2=A0 =C2=A0 struct nfsd4_compound_state *cs =3D &resp->cstate; > >> =C2=A0 =C2=A0 =C2=A0 struct kvec *iov; > >> =C2=A0 =C2=A0 =C2=A0 p =3D resp->tagp; > >> =C2=A0 =C2=A0 =C2=A0 *p++ =3D htonl(resp->taglen); > >> @@ -3304,14 +3306,11 @@ nfs4svc_encode_compoundres(struct svc_rqst= *rqstp, __be32 *p, struct nfsd4_compo > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 iov =3D &rqstp->r= q_res.head[0]; > >> =C2=A0 =C2=A0 =C2=A0 iov->iov_len =3D ((char*)resp->p) - (char*)io= v->iov_base; > >> =C2=A0 =C2=A0 =C2=A0 BUG_ON(iov->iov_len > PAGE_SIZE); > >> - =C2=A0 =C2=A0 if (nfsd4_has_session(&resp->cstate)) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (resp->cstate.statu= s =3D=3D nfserr_replay_cache && > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 !nfsd4_not_cached(resp)) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 iov->iov_len =3D resp->cstate.iovlen; > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 } else { > >> + =C2=A0 =C2=A0 if (nfsd4_has_session(cs)) { > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (cs->status !=3D nf= serr_replay_cache) { > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 nfsd4_store_cache_entry(resp); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 dprintk("%s: SET SLOT STATE TO AVAILABLE\n", __func__); > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 resp->cstate.slot->sl_inuse =3D 0; > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 resp->cstate.slot->sl_inuse =3D false; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 } > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 nfsd4_put_session= (resp->cstate.session); > >> =C2=A0 =C2=A0 =C2=A0 } > >> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c > >> index d68cd05..944ef01 100644 > >> --- a/fs/nfsd/nfssvc.c > >> +++ b/fs/nfsd/nfssvc.c > >> @@ -576,10 +576,6 @@ nfsd_dispatch(struct svc_rqst *rqstp, __be32 = *statp) > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 + rqstp->rq_res.h= ead[0].iov_len; > >> =C2=A0 =C2=A0 =C2=A0 rqstp->rq_res.head[0].iov_len +=3D sizeof(__b= e32); > >> > >> - =C2=A0 =C2=A0 /* NFSv4.1 DRC requires statp */ > >> - =C2=A0 =C2=A0 if (rqstp->rq_vers =3D=3D 4) > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 nfsd4_set_statp(rqstp,= statp); > >> - > >> =C2=A0 =C2=A0 =C2=A0 /* Now call the procedure handler, and encode= NFS status. */ > >> =C2=A0 =C2=A0 =C2=A0 nfserr =3D proc->pc_func(rqstp, rqstp->rq_arg= p, rqstp->rq_resp); > >> =C2=A0 =C2=A0 =C2=A0 nfserr =3D map_new_errors(rqstp->rq_vers, nfs= err); > >> diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state= =2Eh > >> index ff0b771..e745100 100644 > >> --- a/include/linux/nfsd/state.h > >> +++ b/include/linux/nfsd/state.h > >> @@ -94,30 +94,23 @@ struct nfs4_cb_conn { > >> > >> =C2=A0/* Maximum number of slots per session. 160 is useful for lo= ng haul TCP */ > >> =C2=A0#define NFSD_MAX_SLOTS_PER_SESSION =C2=A0 =C2=A0 160 > >> -/* Maximum number of pages per slot cache entry */ > >> -#define NFSD_PAGES_PER_SLOT =C2=A01 > >> -#define NFSD_SLOT_CACHE_SIZE =C2=A0 =C2=A0 =C2=A0 =C2=A0 PAGE_SIZ= E > >> =C2=A0/* Maximum number of operations per session compound */ > >> =C2=A0#define NFSD_MAX_OPS_PER_COMPOUND =C2=A0 =C2=A016 > >> +/* Maximum =C2=A0session per slot cache size */ > >> +#define NFSD_SLOT_CACHE_SIZE =C2=A0 =C2=A0 =C2=A0 =C2=A0 1024 > >> =C2=A0/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session = */ > >> =C2=A0#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION =C2=A0 =C2=A032 > >> =C2=A0#define NFSD_MAX_MEM_PER_SESSION =C2=A0\ > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 (NFSD_CACHE_SIZE_= SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE) > >> > >> -struct nfsd4_cache_entry { > >> - =C2=A0 =C2=A0 __be32 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0ce_status= ; > >> - =C2=A0 =C2=A0 struct kvec =C2=A0 =C2=A0 ce_datav; /* encoded NFS= v4.1 data in rq_res.head[0] */ > >> - =C2=A0 =C2=A0 struct page =C2=A0 =C2=A0 *ce_respages[NFSD_PAGES_= PER_SLOT + 1]; > >> - =C2=A0 =C2=A0 int =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ce_c= achethis; > >> - =C2=A0 =C2=A0 short =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ce_resuse= d; > >> - =C2=A0 =C2=A0 int =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ce_o= pcnt; > >> - =C2=A0 =C2=A0 int =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ce_r= pchdrlen; > >> -}; > >> - > >> =C2=A0struct nfsd4_slot { > >> - =C2=A0 =C2=A0 bool =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0sl_inuse; > >> - =C2=A0 =C2=A0 u32 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 sl_seqid; > >> - =C2=A0 =C2=A0 struct nfsd4_cache_entry =C2=A0 =C2=A0 =C2=A0 =C2=A0= sl_cache_entry; > >> + =C2=A0 =C2=A0 bool =C2=A0 =C2=A0sl_inuse; > >> + =C2=A0 =C2=A0 u32 =C2=A0 =C2=A0 sl_seqid; > >> + =C2=A0 =C2=A0 int =C2=A0 =C2=A0 sl_cachethis; > >> + =C2=A0 =C2=A0 int =C2=A0 =C2=A0 sl_opcnt; > >> + =C2=A0 =C2=A0 __be32 =C2=A0sl_status; > >> + =C2=A0 =C2=A0 u32 =C2=A0 =C2=A0 sl_datalen; > >> + =C2=A0 =C2=A0 char =C2=A0 =C2=A0sl_data[]; > > > > Could you just move sl_inuse to the end? =C2=A0It'll save a few byt= es in the > > structure (because the compiler will probably stick 3 bytes after i= t to > > align sl_seqid.) >=20 > How about this? >=20 > struct nfsd4_slot { > - bool sl_inuse; > - u32 sl_seqid; > - struct nfsd4_cache_entry sl_cache_entry; > + bool sl_inuse; > + bool sl_cachethis; > + u16 sl_opcnt; > + u32 sl_seqid; > + __be32 sl_status; > + u32 sl_datalen; > + char sl_data[]; > }; Looks OK to me! By the way, on recent distro's you should be able to run apt-get install/yum install dwarves, then pahole fs/nfsd/nfsd.o and look through the output for the struct of interest: struct nfsd4_slot { bool sl_inuse; /* 0 1= */ /* XXX 3 bytes hole, try to pack */ u32 sl_seqid; /* 4 4= */ struct nfsd4_cache_entry sl_cache_entry; /* 8 36= */ /* size: 44, cachelines: 1 */ /* sum members: 41, holes: 1, sum holes: 3 */ /* last cacheline: 44 bytes */ }; /* definitions: 3 */ (But no point obsessing over this kind of thing; I just happened to notice the problem in this patch and figured for such a common structur= e it might be worth mentioning.) --b. >=20 > -->Andy >=20 > > --b. > > > >> =C2=A0}; > >> > >> =C2=A0struct nfsd4_channel_attrs { > >> @@ -159,7 +152,7 @@ struct nfsd4_session { > >> =C2=A0 =C2=A0 =C2=A0 struct nfs4_sessionid =C2=A0 se_sessionid; > >> =C2=A0 =C2=A0 =C2=A0 struct nfsd4_channel_attrs se_fchannel; > >> =C2=A0 =C2=A0 =C2=A0 struct nfsd4_channel_attrs se_bchannel; > >> - =C2=A0 =C2=A0 struct nfsd4_slot =C2=A0 =C2=A0 =C2=A0 se_slots[];= =C2=A0 =C2=A0 /* forward channel slots */ > >> + =C2=A0 =C2=A0 struct nfsd4_slot =C2=A0 =C2=A0 =C2=A0 *se_slots[]= ; =C2=A0 =C2=A0/* forward channel slots */ > >> =C2=A0}; > >> > >> =C2=A0static inline void > >> diff --git a/include/linux/nfsd/xdr4.h b/include/linux/nfsd/xdr4.h > >> index 3f71660..73164c2 100644 > >> --- a/include/linux/nfsd/xdr4.h > >> +++ b/include/linux/nfsd/xdr4.h > >> @@ -51,7 +51,7 @@ struct nfsd4_compound_state { > >> =C2=A0 =C2=A0 =C2=A0 /* For sessions DRC */ > >> =C2=A0 =C2=A0 =C2=A0 struct nfsd4_session =C2=A0 =C2=A0*session; > >> =C2=A0 =C2=A0 =C2=A0 struct nfsd4_slot =C2=A0 =C2=A0 =C2=A0 *slot; > >> - =C2=A0 =C2=A0 __be32 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0*statp; > >> + =C2=A0 =C2=A0 __be32 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0*datap; > >> =C2=A0 =C2=A0 =C2=A0 size_t =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0iovlen; > >> =C2=A0 =C2=A0 =C2=A0 u32 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 minorversion; > >> =C2=A0 =C2=A0 =C2=A0 u32 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 status; > >> @@ -472,8 +472,7 @@ static inline bool nfsd4_is_solo_sequence(stru= ct nfsd4_compoundres *resp) > >> > >> =C2=A0static inline bool nfsd4_not_cached(struct nfsd4_compoundres= *resp) > >> =C2=A0{ > >> - =C2=A0 =C2=A0 return !resp->cstate.slot->sl_cache_entry.ce_cache= this || > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 nfsd4_is_solo_sequence(resp); > >> + =C2=A0 =C2=A0 return !resp->cstate.slot->sl_cachethis || nfsd4_i= s_solo_sequence(resp); > >> =C2=A0} > >> > >> =C2=A0#define NFS4_SVC_XDRSIZE =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 sizeof(struct nfsd4_compoundargs) > >> -- > >> 1.6.2.5 > >> > > _______________________________________________ > > pNFS mailing list > > pNFS@linux-nfs.org > > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [pnfs] [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC 2009-09-01 15:14 ` J. Bruce Fields @ 2009-09-01 15:22 ` William A. (Andy) Adamson 0 siblings, 0 replies; 23+ messages in thread From: William A. (Andy) Adamson @ 2009-09-01 15:22 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs, pnfs On Tue, Sep 1, 2009 at 11:14 AM, J. Bruce Fields<bfields@fieldses.org> = wrote: > On Tue, Sep 01, 2009 at 09:48:41AM -0400, William A. (Andy) Adamson w= rote: >> On Fri, Aug 28, 2009 at 5:33 PM, J. Bruce Fields<bfields-uC3wQj2KruMpug/h7KTFAQ@public.gmane.org= g> wrote: >> > On Thu, Aug 27, 2009 at 12:07:44PM -0400, andros@netapp.com wrote: >> >> From: Andy Adamson <andros@netapp.com> >> >> >> >> Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of= holding nfsd >> >> pages in cache. >> >> >> >> Connectathon testing has shown that 1024 bytes for encoded compou= nd operation >> >> responses past the sequence operation is sufficient, 512 bytes is= a little too >> >> small. Set NFSD_SLOT_CACHE_SIZE to 1024. >> >> >> >> Allocate memory for the session DRC in the CREATE_SESSION operati= on >> >> to guarantee that the memory resource is available for caching re= sponses. >> >> Allocate each slot individually in preparation for slot table siz= e negotiation. >> >> >> >> Remove struct nfsd4_cache_entry and helper functions for the old = page-based >> >> DRC. >> >> >> >> The iov_len calculation in nfs4svc_encode_compoundres is now alwa= ys >> >> correct, clean up the nfs4svc_encode_compoundres session logic. >> >> >> >> The nfsd4_compound_state statp pointer is also not used. >> >> Remove nfsd4_set_statp(). >> >> >> >> Move useful nfsd4_cache_entry fields into nfsd4_slot. >> >> >> >> Signed-off-by: Andy Adamson <andros@netapp.com >> >> --- >> >> =A0fs/nfsd/nfs4state.c =A0 =A0 =A0 =A0| =A0207 ++++++++++++------= -------------------------- >> >> =A0fs/nfsd/nfs4xdr.c =A0 =A0 =A0 =A0 =A0| =A0 13 ++-- >> >> =A0fs/nfsd/nfssvc.c =A0 =A0 =A0 =A0 =A0 | =A0 =A04 - >> >> =A0include/linux/nfsd/state.h | =A0 27 ++---- >> >> =A0include/linux/nfsd/xdr4.h =A0| =A0 =A05 +- >> >> =A05 files changed, 74 insertions(+), 182 deletions(-) >> >> >> >> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >> >> index 4695cec..2d72d5c 100644 >> >> --- a/fs/nfsd/nfs4state.c >> >> +++ b/fs/nfsd/nfs4state.c >> >> @@ -510,12 +510,22 @@ static int init_forechannel_attrs(struct sv= c_rqst *rqstp, >> >> =A0 =A0 =A0 return status; >> >> =A0} >> >> >> >> +static void >> >> +free_session_slots(struct nfsd4_session *ses) >> >> +{ >> >> + =A0 =A0 int i; >> >> + >> >> + =A0 =A0 for (i =3D 0; i < ses->se_fchannel.maxreqs; i++) >> >> + =A0 =A0 =A0 =A0 =A0 =A0 kfree(ses->se_slots[i]); >> >> +} >> >> + >> >> =A0static int >> >> =A0alloc_init_session(struct svc_rqst *rqstp, struct nfs4_client = *clp, >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct nfsd4_create_session *c= ses) >> >> =A0{ >> >> =A0 =A0 =A0 struct nfsd4_session *new, tmp; >> >> - =A0 =A0 int idx, status =3D nfserr_serverfault, slotsize; >> >> + =A0 =A0 struct nfsd4_slot *sp; >> >> + =A0 =A0 int idx, status =3D nfserr_serverfault, slotsize, cache= size, i; >> > >> > Just as a style thing: that list's getting a little long. =A0Could= you >> > keep at least "status" on a separate line? >> > >> >> >> >> =A0 =A0 =A0 memset(&tmp, 0, sizeof(tmp)); >> >> >> >> @@ -526,14 +536,23 @@ alloc_init_session(struct svc_rqst *rqstp, = struct nfs4_client *clp, >> >> =A0 =A0 =A0 if (status) >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out; >> >> >> >> - =A0 =A0 /* allocate struct nfsd4_session and slot table in one = piece */ >> >> - =A0 =A0 slotsize =3D tmp.se_fchannel.maxreqs * sizeof(struct nf= sd4_slot); >> >> + =A0 =A0 /* allocate struct nfsd4_session and slot table pointer= s in one piece */ >> >> + =A0 =A0 slotsize =3D tmp.se_fchannel.maxreqs * sizeof(struct nf= sd4_slot *); >> >> =A0 =A0 =A0 new =3D kzalloc(sizeof(*new) + slotsize, GFP_KERNEL); >> > >> > I think this is OK for now, but maybe stick something like: >> > >> > =A0 =A0 =A0 =A0BUILD_BUG_ON(NFSD_MAX_SLOTS_PER_SESSION * sizeof(st= ruct nfsd4_slot) >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0+ sizeof(struct nfs= d4_session) > PAGE_SIZE); >> > >> > in state.h just to warn anyone who wants to blindly bump up >> > NFSD_MAX_SLOTS_PER_SESSION. =A0(It's not really forbidden to kmall= oc more >> > than a page, but it's also not reliable, and if it becomes necessa= ry >> > then we'd rather find some way to code around it.) >> > >> >> =A0 =A0 =A0 if (!new) >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out; >> >> >> >> =A0 =A0 =A0 memcpy(new, &tmp, sizeof(*new)); >> >> >> >> + =A0 =A0 /* allocate each struct nfsd4_slot and data cache in on= e piece */ >> >> + =A0 =A0 cachesize =3D new->se_fchannel.maxresp_cached - NFSD_MI= N_HDR_SEQ_SZ; >> >> + =A0 =A0 for (i =3D 0; i < new->se_fchannel.maxreqs; i++) { >> >> + =A0 =A0 =A0 =A0 =A0 =A0 sp =3D kzalloc(sizeof(*sp) + cachesize,= GFP_KERNEL); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 if (!sp) >> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out_free; >> >> + =A0 =A0 =A0 =A0 =A0 =A0 new->se_slots[i] =3D sp; >> >> + =A0 =A0 } >> >> + >> >> =A0 =A0 =A0 new->se_client =3D clp; >> >> =A0 =A0 =A0 gen_sessionid(new); >> >> =A0 =A0 =A0 idx =3D hash_sessionid(&new->se_sessionid); >> >> @@ -550,6 +569,10 @@ alloc_init_session(struct svc_rqst *rqstp, s= truct nfs4_client *clp, >> >> =A0 =A0 =A0 status =3D nfs_ok; >> >> =A0out: >> >> =A0 =A0 =A0 return status; >> >> +out_free: >> >> + =A0 =A0 free_session_slots(new); >> >> + =A0 =A0 kfree(new); >> >> + =A0 =A0 goto out; >> >> =A0} >> >> >> >> =A0/* caller must hold sessionid_lock */ >> >> @@ -592,22 +615,16 @@ release_session(struct nfsd4_session *ses) >> >> =A0 =A0 =A0 nfsd4_put_session(ses); >> >> =A0} >> >> >> >> -static void nfsd4_release_respages(struct page **respages, short= resused); >> >> - >> >> =A0void >> >> =A0free_session(struct kref *kref) >> >> =A0{ >> >> =A0 =A0 =A0 struct nfsd4_session *ses; >> >> - =A0 =A0 int i; >> >> >> >> =A0 =A0 =A0 ses =3D container_of(kref, struct nfsd4_session, se_r= ef); >> >> - =A0 =A0 for (i =3D 0; i < ses->se_fchannel.maxreqs; i++) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 struct nfsd4_cache_entry *e =3D &ses->s= e_slots[i].sl_cache_entry; >> >> - =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_release_respages(e->ce_respages, = e->ce_resused); >> >> - =A0 =A0 } >> >> =A0 =A0 =A0 spin_lock(&nfsd_drc_lock); >> >> =A0 =A0 =A0 nfsd_drc_mem_used -=3D ses->se_fchannel.maxreqs * NFS= D_SLOT_CACHE_SIZE; >> >> =A0 =A0 =A0 spin_unlock(&nfsd_drc_lock); >> >> + =A0 =A0 free_session_slots(ses); >> >> =A0 =A0 =A0 kfree(ses); >> >> =A0} >> >> >> >> @@ -964,116 +981,32 @@ out_err: >> >> =A0 =A0 =A0 return; >> >> =A0} >> >> >> >> -void >> >> -nfsd4_set_statp(struct svc_rqst *rqstp, __be32 *statp) >> >> -{ >> >> - =A0 =A0 struct nfsd4_compoundres *resp =3D rqstp->rq_resp; >> >> - >> >> - =A0 =A0 resp->cstate.statp =3D statp; >> >> -} >> >> - >> >> -/* >> >> - * Dereference the result pages. >> >> - */ >> >> -static void >> >> -nfsd4_release_respages(struct page **respages, short resused) >> >> -{ >> >> - =A0 =A0 int i; >> >> - >> >> - =A0 =A0 dprintk("--> %s\n", __func__); >> >> - =A0 =A0 for (i =3D 0; i < resused; i++) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 if (!respages[i]) >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 continue; >> >> - =A0 =A0 =A0 =A0 =A0 =A0 put_page(respages[i]); >> >> - =A0 =A0 =A0 =A0 =A0 =A0 respages[i] =3D NULL; >> >> - =A0 =A0 } >> >> -} >> >> - >> >> -static void >> >> -nfsd4_copy_pages(struct page **topages, struct page **frompages,= short count) >> >> -{ >> >> - =A0 =A0 int i; >> >> - >> >> - =A0 =A0 for (i =3D 0; i < count; i++) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 topages[i] =3D frompages[i]; >> >> - =A0 =A0 =A0 =A0 =A0 =A0 if (!topages[i]) >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 continue; >> >> - =A0 =A0 =A0 =A0 =A0 =A0 get_page(topages[i]); >> >> - =A0 =A0 } >> >> -} >> >> - >> >> =A0/* >> >> - * Cache the reply pages up to NFSD_PAGES_PER_SLOT + 1, clearing= the previous >> >> - * pages. We add a page to NFSD_PAGES_PER_SLOT for the case wher= e the total >> >> - * length of the XDR response is less than se_fmaxresp_cached >> >> - * (NFSD_PAGES_PER_SLOT * PAGE_SIZE) but the xdr_buf pages is us= ed for a >> >> - * of the reply (e.g. readdir). >> >> - * >> >> - * Store the base and length of the rq_req.head[0] page >> >> - * of the NFSv4.1 data, just past the rpc header. >> >> + * Cache a reply. nfsd4_check_drc_limit() has bounded the cache = size. >> >> =A0 */ >> >> =A0void >> >> =A0nfsd4_store_cache_entry(struct nfsd4_compoundres *resp) >> >> =A0{ >> >> - =A0 =A0 struct nfsd4_cache_entry *entry =3D &resp->cstate.slot-= >sl_cache_entry; >> >> - =A0 =A0 struct svc_rqst *rqstp =3D resp->rqstp; >> >> - =A0 =A0 struct kvec *resv =3D &rqstp->rq_res.head[0]; >> >> - >> >> - =A0 =A0 dprintk("--> %s entry %p\n", __func__, entry); >> >> + =A0 =A0 struct nfsd4_slot *slot =3D resp->cstate.slot; >> >> + =A0 =A0 unsigned int base; >> >> >> >> - =A0 =A0 nfsd4_release_respages(entry->ce_respages, entry->ce_re= sused); >> >> - =A0 =A0 entry->ce_opcnt =3D resp->opcnt; >> >> - =A0 =A0 entry->ce_status =3D resp->cstate.status; >> >> + =A0 =A0 dprintk("--> %s slot %p\n", __func__, slot); >> >> >> >> - =A0 =A0 /* >> >> - =A0 =A0 =A0* Don't need a page to cache just the sequence opera= tion - the slot >> >> - =A0 =A0 =A0* does this for us! >> >> - =A0 =A0 =A0*/ >> >> + =A0 =A0 slot->sl_opcnt =3D resp->opcnt; >> >> + =A0 =A0 slot->sl_status =3D resp->cstate.status; >> >> >> >> =A0 =A0 =A0 if (nfsd4_not_cached(resp)) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_resused =3D 0; >> >> - =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_rpchdrlen =3D 0; >> >> - =A0 =A0 =A0 =A0 =A0 =A0 dprintk("%s Just cache SEQUENCE. ce_cac= hethis %d\n", __func__, >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 resp->cstate.slot->sl_c= ache_entry.ce_cachethis); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 slot->sl_datalen =3D 0; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 return; >> >> =A0 =A0 =A0 } >> >> - =A0 =A0 entry->ce_resused =3D rqstp->rq_resused; >> >> - =A0 =A0 if (entry->ce_resused > NFSD_PAGES_PER_SLOT + 1) >> >> - =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_resused =3D NFSD_PAGES_PER_SL= OT + 1; >> >> - =A0 =A0 nfsd4_copy_pages(entry->ce_respages, rqstp->rq_respages= , >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0entry->ce_resused); >> >> - =A0 =A0 entry->ce_datav.iov_base =3D resp->cstate.statp; >> >> - =A0 =A0 entry->ce_datav.iov_len =3D resv->iov_len - ((char *)re= sp->cstate.statp - >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 (char *= )page_address(rqstp->rq_respages[0])); >> >> - =A0 =A0 /* Current request rpc header length*/ >> >> - =A0 =A0 entry->ce_rpchdrlen =3D (char *)resp->cstate.statp - >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 (char *= )page_address(rqstp->rq_respages[0]); >> >> -} >> >> - >> >> -/* >> >> - * We keep the rpc header, but take the nfs reply from the reply= cache. >> >> - */ >> >> -static int >> >> -nfsd41_copy_replay_data(struct nfsd4_compoundres *resp, >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 struct nfsd4_cache_entr= y *entry) >> >> -{ >> >> - =A0 =A0 struct svc_rqst *rqstp =3D resp->rqstp; >> >> - =A0 =A0 struct kvec *resv =3D &resp->rqstp->rq_res.head[0]; >> >> - =A0 =A0 int len; >> >> - >> >> - =A0 =A0 /* Current request rpc header length*/ >> >> - =A0 =A0 len =3D (char *)resp->cstate.statp - >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 (char *)page_address(rq= stp->rq_respages[0]); >> >> - =A0 =A0 if (entry->ce_datav.iov_len + len > PAGE_SIZE) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 dprintk("%s v41 cached reply too large = (%Zd).\n", __func__, >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_datav.iov_len= ); >> >> - =A0 =A0 =A0 =A0 =A0 =A0 return 0; >> >> - =A0 =A0 } >> >> - =A0 =A0 /* copy the cached reply nfsd data past the current rpc= header */ >> >> - =A0 =A0 memcpy((char *)resv->iov_base + len, entry->ce_datav.io= v_base, >> >> - =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_datav.iov_len); >> >> - =A0 =A0 resv->iov_len =3D len + entry->ce_datav.iov_len; >> >> - =A0 =A0 return 1; >> >> + =A0 =A0 slot->sl_datalen =3D (char *)resp->p - (char *)resp->cs= tate.datap; >> >> + =A0 =A0 base =3D (char *)resp->cstate.datap - >> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 (char *)resp->xbuf->head[0].iov_base; >> >> + =A0 =A0 if (read_bytes_from_xdr_buf(resp->xbuf, base, slot->sl_= data, >> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= slot->sl_datalen)) >> >> + =A0 =A0 =A0 =A0 =A0 =A0 printk(KERN_WARNING >> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 "nfsd: sessions DRC cou= ld not cache compound\n"); >> > >> > I'd make this WARN("nfsd:...") just to make it completely clear it= 's a >> > kernel bug. =A0(This case should be caught by nfsd4_check_drc_limi= t unless >> > we've messed something up, right?) >> > >> >> + =A0 =A0 return; >> >> =A0} >> >> >> >> =A0/* >> >> @@ -1091,14 +1024,14 @@ nfsd4_enc_sequence_replay(struct nfsd4_co= mpoundargs *args, >> >> =A0 =A0 =A0 struct nfsd4_slot *slot =3D resp->cstate.slot; >> >> >> >> =A0 =A0 =A0 dprintk("--> %s resp->opcnt %d cachethis %u \n", __fu= nc__, >> >> - =A0 =A0 =A0 =A0 =A0 =A0 resp->opcnt, resp->cstate.slot->sl_cach= e_entry.ce_cachethis); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 resp->opcnt, resp->cstate.slot->sl_cach= ethis); >> >> >> >> =A0 =A0 =A0 /* Encode the replayed sequence operation */ >> >> =A0 =A0 =A0 op =3D &args->ops[resp->opcnt - 1]; >> >> =A0 =A0 =A0 nfsd4_encode_operation(resp, op); >> >> >> >> =A0 =A0 =A0 /* Return nfserr_retry_uncached_rep in next operation= =2E */ >> >> - =A0 =A0 if (args->opcnt > 1 && slot->sl_cache_entry.ce_cachethi= s =3D=3D 0) { >> >> + =A0 =A0 if (args->opcnt > 1 && slot->sl_cachethis =3D=3D 0) { >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 op =3D &args->ops[resp->opcnt++]; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 op->status =3D nfserr_retry_uncached_= rep; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_encode_operation(resp, op); >> >> @@ -1107,57 +1040,29 @@ nfsd4_enc_sequence_replay(struct nfsd4_co= mpoundargs *args, >> >> =A0} >> >> >> >> =A0/* >> >> - * Keep the first page of the replay. Copy the NFSv4.1 data from= the first >> >> - * cached page. =A0Replace any futher replay pages from the cach= e. >> >> + * The sequence operation is not cached because we can use the s= lot and >> >> + * session values. >> >> =A0 */ >> >> =A0__be32 >> >> =A0nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp, >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct nfsd4_seque= nce *seq) >> >> =A0{ >> >> - =A0 =A0 struct nfsd4_cache_entry *entry =3D &resp->cstate.slot-= >sl_cache_entry; >> >> + =A0 =A0 struct nfsd4_slot *slot =3D resp->cstate.slot; >> >> =A0 =A0 =A0 __be32 status; >> >> >> >> - =A0 =A0 dprintk("--> %s entry %p\n", __func__, entry); >> >> - >> >> - =A0 =A0 /* >> >> - =A0 =A0 =A0* If this is just the sequence operation, we did not= keep >> >> - =A0 =A0 =A0* a page in the cache entry because we can just use = the >> >> - =A0 =A0 =A0* slot info stored in struct nfsd4_sequence that was= checked >> >> - =A0 =A0 =A0* against the slot in nfsd4_sequence(). >> >> - =A0 =A0 =A0* >> >> - =A0 =A0 =A0* This occurs when seq->cachethis is FALSE, or when = the client >> >> - =A0 =A0 =A0* session inactivity timer fires and a solo sequence= operation >> >> - =A0 =A0 =A0* is sent (lease renewal). >> >> - =A0 =A0 =A0*/ >> >> + =A0 =A0 dprintk("--> %s slot %p\n", __func__, slot); >> >> >> >> =A0 =A0 =A0 /* Either returns 0 or nfserr_retry_uncached */ >> >> =A0 =A0 =A0 status =3D nfsd4_enc_sequence_replay(resp->rqstp->rq_= argp, resp); >> >> =A0 =A0 =A0 if (status =3D=3D nfserr_retry_uncached_rep) >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 return status; >> >> >> >> - =A0 =A0 if (!nfsd41_copy_replay_data(resp, entry)) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 /* >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0* Not enough room to use the replay = rpc header, send the >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0* cached header. Release all the all= ocated result pages. >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0*/ >> >> - =A0 =A0 =A0 =A0 =A0 =A0 svc_free_res_pages(resp->rqstp); >> >> - =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_copy_pages(resp->rqstp->rq_respag= es, entry->ce_respages, >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 entry->ce_resused); >> >> - =A0 =A0 } else { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 /* Release all but the first allocated = result page */ >> >> - >> >> - =A0 =A0 =A0 =A0 =A0 =A0 resp->rqstp->rq_resused--; >> >> - =A0 =A0 =A0 =A0 =A0 =A0 svc_free_res_pages(resp->rqstp); >> >> - >> >> - =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_copy_pages(&resp->rqstp->rq_respa= ges[1], >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0&ent= ry->ce_respages[1], >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0entr= y->ce_resused - 1); >> >> - =A0 =A0 } >> >> + =A0 =A0 /* The sequence operation has been encoded, cstate->dat= ap set. */ >> >> + =A0 =A0 memcpy(resp->cstate.datap, slot->sl_data, slot->sl_data= len); >> >> >> >> - =A0 =A0 resp->rqstp->rq_resused =3D entry->ce_resused; >> >> - =A0 =A0 resp->opcnt =3D entry->ce_opcnt; >> >> - =A0 =A0 resp->cstate.iovlen =3D entry->ce_datav.iov_len + entry= ->ce_rpchdrlen; >> >> - =A0 =A0 status =3D entry->ce_status; >> >> + =A0 =A0 resp->opcnt =3D slot->sl_opcnt; >> >> + =A0 =A0 resp->p =3D resp->cstate.datap + XDR_QUADLEN(slot->sl_d= atalen); >> >> + =A0 =A0 status =3D slot->sl_status; >> >> >> >> =A0 =A0 =A0 return status; >> >> =A0} >> >> @@ -1489,7 +1394,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, >> >> =A0 =A0 =A0 if (seq->slotid >=3D session->se_fchannel.maxreqs) >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out; >> >> >> >> - =A0 =A0 slot =3D &session->se_slots[seq->slotid]; >> >> + =A0 =A0 slot =3D session->se_slots[seq->slotid]; >> >> =A0 =A0 =A0 dprintk("%s: slotid %d\n", __func__, seq->slotid); >> >> >> >> =A0 =A0 =A0 /* We do not negotiate the number of slots yet, so se= t the >> >> @@ -1502,7 +1407,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 cstate->slot =3D slot; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 cstate->session =3D session; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 /* Return the cached reply status and= set cstate->status >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0* for nfsd4_svc_encode_compoundres p= rocessing */ >> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0* for nfsd4_proc_compound processing= */ >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 status =3D nfsd4_replay_cache_entry(r= esp, seq); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 cstate->status =3D nfserr_replay_cach= e; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto replay_cache; >> >> @@ -1513,7 +1418,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, >> >> =A0 =A0 =A0 /* Success! bump slot seqid */ >> >> =A0 =A0 =A0 slot->sl_inuse =3D true; >> >> =A0 =A0 =A0 slot->sl_seqid =3D seq->seqid; >> >> - =A0 =A0 slot->sl_cache_entry.ce_cachethis =3D seq->cachethis; >> >> + =A0 =A0 slot->sl_cachethis =3D seq->cachethis; >> >> >> >> =A0 =A0 =A0 cstate->slot =3D slot; >> >> =A0 =A0 =A0 cstate->session =3D session; >> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c >> >> index fdf632b..49824ea 100644 >> >> --- a/fs/nfsd/nfs4xdr.c >> >> +++ b/fs/nfsd/nfs4xdr.c >> >> @@ -3064,6 +3064,7 @@ nfsd4_encode_sequence(struct nfsd4_compound= res *resp, int nfserr, >> >> =A0 =A0 =A0 WRITE32(0); >> >> >> >> =A0 =A0 =A0 ADJUST_ARGS(); >> >> + =A0 =A0 resp->cstate.datap =3D p; /* DRC cache data pointer */ >> >> =A0 =A0 =A0 return 0; >> >> =A0} >> >> >> >> @@ -3166,7 +3167,7 @@ static int nfsd4_check_drc_limit(struct nfs= d4_compoundres *resp) >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 return status; >> >> >> >> =A0 =A0 =A0 session =3D resp->cstate.session; >> >> - =A0 =A0 if (session =3D=3D NULL || slot->sl_cache_entry.ce_cach= ethis =3D=3D 0) >> >> + =A0 =A0 if (session =3D=3D NULL || slot->sl_cachethis =3D=3D 0) >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 return status; >> >> >> >> =A0 =A0 =A0 if (resp->opcnt >=3D args->opcnt) >> >> @@ -3291,6 +3292,7 @@ nfs4svc_encode_compoundres(struct svc_rqst = *rqstp, __be32 *p, struct nfsd4_compo >> >> =A0 =A0 =A0 /* >> >> =A0 =A0 =A0 =A0* All that remains is to write the tag and operati= on count... >> >> =A0 =A0 =A0 =A0*/ >> >> + =A0 =A0 struct nfsd4_compound_state *cs =3D &resp->cstate; >> >> =A0 =A0 =A0 struct kvec *iov; >> >> =A0 =A0 =A0 p =3D resp->tagp; >> >> =A0 =A0 =A0 *p++ =3D htonl(resp->taglen); >> >> @@ -3304,14 +3306,11 @@ nfs4svc_encode_compoundres(struct svc_rqs= t *rqstp, __be32 *p, struct nfsd4_compo >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 iov =3D &rqstp->rq_res.head[0]; >> >> =A0 =A0 =A0 iov->iov_len =3D ((char*)resp->p) - (char*)iov->iov_b= ase; >> >> =A0 =A0 =A0 BUG_ON(iov->iov_len > PAGE_SIZE); >> >> - =A0 =A0 if (nfsd4_has_session(&resp->cstate)) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 if (resp->cstate.status =3D=3D nfserr_r= eplay_cache && >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 !nfsd4_= not_cached(resp)) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 iov->iov_len =3D resp->= cstate.iovlen; >> >> - =A0 =A0 =A0 =A0 =A0 =A0 } else { >> >> + =A0 =A0 if (nfsd4_has_session(cs)) { >> >> + =A0 =A0 =A0 =A0 =A0 =A0 if (cs->status !=3D nfserr_replay_cache= ) { >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_store_cache_ent= ry(resp); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 dprintk("%s: SET SLOT= STATE TO AVAILABLE\n", __func__); >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 resp->cstate.slot->sl_i= nuse =3D 0; >> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 resp->cstate.slot->sl_i= nuse =3D false; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 } >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_put_session(resp->cstate.sessio= n); >> >> =A0 =A0 =A0 } >> >> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c >> >> index d68cd05..944ef01 100644 >> >> --- a/fs/nfsd/nfssvc.c >> >> +++ b/fs/nfsd/nfssvc.c >> >> @@ -576,10 +576,6 @@ nfsd_dispatch(struct svc_rqst *rqstp, __be32= *statp) >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 + rqstp->rq_res.head[0].iov_len; >> >> =A0 =A0 =A0 rqstp->rq_res.head[0].iov_len +=3D sizeof(__be32); >> >> >> >> - =A0 =A0 /* NFSv4.1 DRC requires statp */ >> >> - =A0 =A0 if (rqstp->rq_vers =3D=3D 4) >> >> - =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_set_statp(rqstp, statp); >> >> - >> >> =A0 =A0 =A0 /* Now call the procedure handler, and encode NFS sta= tus. */ >> >> =A0 =A0 =A0 nfserr =3D proc->pc_func(rqstp, rqstp->rq_argp, rqstp= ->rq_resp); >> >> =A0 =A0 =A0 nfserr =3D map_new_errors(rqstp->rq_vers, nfserr); >> >> diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/stat= e.h >> >> index ff0b771..e745100 100644 >> >> --- a/include/linux/nfsd/state.h >> >> +++ b/include/linux/nfsd/state.h >> >> @@ -94,30 +94,23 @@ struct nfs4_cb_conn { >> >> >> >> =A0/* Maximum number of slots per session. 160 is useful for long= haul TCP */ >> >> =A0#define NFSD_MAX_SLOTS_PER_SESSION =A0 =A0 160 >> >> -/* Maximum number of pages per slot cache entry */ >> >> -#define NFSD_PAGES_PER_SLOT =A01 >> >> -#define NFSD_SLOT_CACHE_SIZE =A0 =A0 =A0 =A0 PAGE_SIZE >> >> =A0/* Maximum number of operations per session compound */ >> >> =A0#define NFSD_MAX_OPS_PER_COMPOUND =A0 =A016 >> >> +/* Maximum =A0session per slot cache size */ >> >> +#define NFSD_SLOT_CACHE_SIZE =A0 =A0 =A0 =A0 1024 >> >> =A0/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */ >> >> =A0#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION =A0 =A032 >> >> =A0#define NFSD_MAX_MEM_PER_SESSION =A0\ >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 (NFSD_CACHE_SIZE_SLOTS_PER_SESSION * = NFSD_SLOT_CACHE_SIZE) >> >> >> >> -struct nfsd4_cache_entry { >> >> - =A0 =A0 __be32 =A0 =A0 =A0 =A0 =A0ce_status; >> >> - =A0 =A0 struct kvec =A0 =A0 ce_datav; /* encoded NFSv4.1 data i= n rq_res.head[0] */ >> >> - =A0 =A0 struct page =A0 =A0 *ce_respages[NFSD_PAGES_PER_SLOT + = 1]; >> >> - =A0 =A0 int =A0 =A0 =A0 =A0 =A0 =A0 ce_cachethis; >> >> - =A0 =A0 short =A0 =A0 =A0 =A0 =A0 ce_resused; >> >> - =A0 =A0 int =A0 =A0 =A0 =A0 =A0 =A0 ce_opcnt; >> >> - =A0 =A0 int =A0 =A0 =A0 =A0 =A0 =A0 ce_rpchdrlen; >> >> -}; >> >> - >> >> =A0struct nfsd4_slot { >> >> - =A0 =A0 bool =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0sl_inuse; >> >> - =A0 =A0 u32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 sl_seqid; >> >> - =A0 =A0 struct nfsd4_cache_entry =A0 =A0 =A0 =A0sl_cache_entry; >> >> + =A0 =A0 bool =A0 =A0sl_inuse; >> >> + =A0 =A0 u32 =A0 =A0 sl_seqid; >> >> + =A0 =A0 int =A0 =A0 sl_cachethis; >> >> + =A0 =A0 int =A0 =A0 sl_opcnt; >> >> + =A0 =A0 __be32 =A0sl_status; >> >> + =A0 =A0 u32 =A0 =A0 sl_datalen; >> >> + =A0 =A0 char =A0 =A0sl_data[]; >> > >> > Could you just move sl_inuse to the end? =A0It'll save a few bytes= in the >> > structure (because the compiler will probably stick 3 bytes after = it to >> > align sl_seqid.) >> >> How about this? >> >> struct nfsd4_slot { >> - =A0 =A0 =A0 bool =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0sl_inuse; >> - =A0 =A0 =A0 u32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 sl_seqid; >> - =A0 =A0 =A0 struct nfsd4_cache_entry =A0 =A0 =A0 =A0sl_cache_entry= ; >> + =A0 =A0 =A0 bool =A0 =A0sl_inuse; >> + =A0 =A0 =A0 bool =A0 =A0sl_cachethis; >> + =A0 =A0 =A0 u16 =A0 =A0 sl_opcnt; >> + =A0 =A0 =A0 u32 =A0 =A0 sl_seqid; >> + =A0 =A0 =A0 __be32 =A0sl_status; >> + =A0 =A0 =A0 u32 =A0 =A0 sl_datalen; >> + =A0 =A0 =A0 char =A0 =A0sl_data[]; >> =A0}; > > Looks OK to me! =A0By the way, on recent distro's you should be able = to > run apt-get install/yum install dwarves, then > > =A0 =A0 =A0 =A0pahole fs/nfsd/nfsd.o > > and look through the output for the struct of interest: > > struct nfsd4_slot { > =A0 =A0 =A0 =A0bool =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 sl_in= use; =A0 =A0 =A0 =A0 =A0 =A0 /* =A0 =A0 0 =A0 =A0 1 */ > > =A0 =A0 =A0 =A0/* XXX 3 bytes hole, try to pack */ > > =A0 =A0 =A0 =A0u32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0sl_= seqid; =A0 =A0 =A0 =A0 =A0 =A0 /* =A0 =A0 4 =A0 =A0 4 */ > =A0 =A0 =A0 =A0struct nfsd4_cache_entry =A0 sl_cache_entry; =A0 =A0 =A0= /* =A0 =A0 8 =A0 =A036 */ > > =A0 =A0 =A0 =A0/* size: 44, cachelines: 1 */ > =A0 =A0 =A0 =A0/* sum members: 41, holes: 1, sum holes: 3 */ > =A0 =A0 =A0 =A0/* last cacheline: 44 bytes */ > }; =A0 =A0 =A0/* definitions: 3 */ > > (But no point obsessing over this kind of thing; I just happened to > notice the problem in this patch and figured for such a common struct= ure > it might be worth mentioning.) Cool! - thanks for the pointer -->Andy > > --b. > >> >> -->Andy >> >> > --b. >> > >> >> =A0}; >> >> >> >> =A0struct nfsd4_channel_attrs { >> >> @@ -159,7 +152,7 @@ struct nfsd4_session { >> >> =A0 =A0 =A0 struct nfs4_sessionid =A0 se_sessionid; >> >> =A0 =A0 =A0 struct nfsd4_channel_attrs se_fchannel; >> >> =A0 =A0 =A0 struct nfsd4_channel_attrs se_bchannel; >> >> - =A0 =A0 struct nfsd4_slot =A0 =A0 =A0 se_slots[]; =A0 =A0 /* fo= rward channel slots */ >> >> + =A0 =A0 struct nfsd4_slot =A0 =A0 =A0 *se_slots[]; =A0 =A0/* fo= rward channel slots */ >> >> =A0}; >> >> >> >> =A0static inline void >> >> diff --git a/include/linux/nfsd/xdr4.h b/include/linux/nfsd/xdr4.= h >> >> index 3f71660..73164c2 100644 >> >> --- a/include/linux/nfsd/xdr4.h >> >> +++ b/include/linux/nfsd/xdr4.h >> >> @@ -51,7 +51,7 @@ struct nfsd4_compound_state { >> >> =A0 =A0 =A0 /* For sessions DRC */ >> >> =A0 =A0 =A0 struct nfsd4_session =A0 =A0*session; >> >> =A0 =A0 =A0 struct nfsd4_slot =A0 =A0 =A0 *slot; >> >> - =A0 =A0 __be32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0*statp; >> >> + =A0 =A0 __be32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0*datap; >> >> =A0 =A0 =A0 size_t =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0iovlen; >> >> =A0 =A0 =A0 u32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 minorvers= ion; >> >> =A0 =A0 =A0 u32 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 status; >> >> @@ -472,8 +472,7 @@ static inline bool nfsd4_is_solo_sequence(str= uct nfsd4_compoundres *resp) >> >> >> >> =A0static inline bool nfsd4_not_cached(struct nfsd4_compoundres *= resp) >> >> =A0{ >> >> - =A0 =A0 return !resp->cstate.slot->sl_cache_entry.ce_cachethis = || >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsd4_is_solo_sequence(= resp); >> >> + =A0 =A0 return !resp->cstate.slot->sl_cachethis || nfsd4_is_sol= o_sequence(resp); >> >> =A0} >> >> >> >> =A0#define NFS4_SVC_XDRSIZE =A0 =A0 =A0 =A0 =A0 =A0 sizeof(struct= nfsd4_compoundargs) >> >> -- >> >> 1.6.2.5 >> >> >> > _______________________________________________ >> > pNFS mailing list >> > pNFS@linux-nfs.org >> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs >> > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage 2009-08-27 16:07 ` [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage andros 2009-08-27 16:07 ` [PATCH 3/5] nfsd41: use session maxreqs for sequence target and highest slotid andros @ 2009-08-27 17:12 ` William A. (Andy) Adamson 2009-08-28 20:41 ` J. Bruce Fields 2 siblings, 0 replies; 23+ messages in thread From: William A. (Andy) Adamson @ 2009-08-27 17:12 UTC (permalink / raw) To: bfields; +Cc: linux-nfs, pnfs, Andy Adamson On Thu, Aug 27, 2009 at 12:07 PM, <andros@netapp.com> wrote: > From: Andy Adamson <andros@netapp.com> > > By using the requested ca_maxresponsesize_cached * ca_maxresponses to= bound > a forechannel drc request size, clients can tailor a session to usage= =2E > > For example, an I/O session (READ/WRITE only) can have a much smaller > ca_maxresponsesize_cached (for only WRITE compound responses) and a l= ot larger > ca_maxresponses to service a large in-flight data window. > > Signed-off-by: Andy Adamson <andros@netapp.com> > --- > =A0fs/nfsd/nfs4state.c =A0 =A0 =A0 =A0| =A0 60 ++++++++++++++++++++++= +++++++++------------ > =A0include/linux/nfsd/state.h | =A0 =A08 ++++- > =A02 files changed, 49 insertions(+), 19 deletions(-) > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > index ddfd36f..a691139 100644 > --- a/fs/nfsd/nfs4state.c > +++ b/fs/nfsd/nfs4state.c > @@ -414,34 +414,60 @@ gen_sessionid(struct nfsd4_session *ses) > =A0} > > =A0/* > - * Give the client the number of slots it requests bound by > - * NFSD_MAX_SLOTS_PER_SESSION and by nfsd_drc_max_mem. > + * 32 bytes of RPC header and 44 bytes of sequence operation respons= e > + * not included in NFSD_SLOT_CACHE_SIZE > + * */ > +#define NFSD_MIN_HDR_SEQ_SZ =A0(32 + 44) > + > +/* > + * Give the client the number of ca_maxresponsesize_cached slots it = requests > + * bounded by NFSD_SLOT_CACHE_SIZE, NFSD_MAX_MEM_PER_SESSION and by > + * nfsd_drc_max_mem. Do not allow more than NFSD_MAX_SLOTS_PER_SESSI= ON. > + * > + * The ca_maxresponssize_cached definition includes the size > + * of the rpc header with the variable length security flavor creden= tial > + * plus verifier (32 bytes with AUTH_SYS and NULL verifier) > + * as well as the encoded SEQUENCE operation response (44 bytes) > + * which are not included in NFSD_SLOT_CACHE_SIZE. > + * We err on the side of being a bit small when AUTH_SYS/NULL verifi= er > + * is not used. > =A0* > =A0* If we run out of reserved DRC memory we should (up to a point) r= e-negotiate > =A0* active sessions and reduce their slot usage to make rooom for ne= w > =A0* connections. For now we just fail the create session. > =A0*/ > -static int set_forechannel_maxreqs(struct nfsd4_channel_attrs *fchan= ) > +static int set_forechannel_drc_size(struct nfsd4_channel_attrs *fcha= n) > =A0{ > - =A0 =A0 =A0 int mem; > + =A0 =A0 =A0 int mem, size =3D fchan->maxresp_cached - NFSD_MIN_HDR_= SEQ_SZ; > > - =A0 =A0 =A0 if (fchan->maxreqs < 1) > + =A0 =A0 =A0 if (fchan->maxreqs < 1 || size <=3D 0) > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0return nfserr_inval; > - =A0 =A0 =A0 else if (fchan->maxreqs > NFSD_MAX_SLOTS_PER_SESSION) > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 fchan->maxreqs =3D NFSD_MAX_SLOTS_PER_S= ESSION; > > - =A0 =A0 =A0 mem =3D fchan->maxreqs * NFSD_SLOT_CACHE_SIZE; > + =A0 =A0 =A0 if (size > NFSD_SLOT_CACHE_SIZE) > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 size =3D NFSD_SLOT_CACHE_SIZE; > + > + =A0 =A0 =A0 /* bound the maxreqs by NFSD_MAX_MEM_PER_SESSION */ > + =A0 =A0 =A0 mem =3D fchan->maxreqs * size; > + =A0 =A0 =A0 if (mem > NFSD_MAX_MEM_PER_SESSION) { > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 fchan->maxreqs =3D NFSD_MAX_MEM_PER_SES= SION / size; > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (fchan->maxreqs > NFSD_MAX_SLOTS_PER= _SESSION) > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 fchan->maxreqs =3D NFSD= _MAX_SLOTS_PER_SESSION; > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 mem =3D fchan->maxreqs * size; > + =A0 =A0 =A0 } > > =A0 =A0 =A0 =A0spin_lock(&nfsd_drc_lock); > - =A0 =A0 =A0 if (mem + nfsd_drc_mem_used > nfsd_drc_max_mem) > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 mem =3D ((nfsd_drc_max_mem - nfsd_drc_m= em_used) / > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 NFSD_SL= OT_CACHE_SIZE) * NFSD_SLOT_CACHE_SIZE; > + =A0 =A0 =A0 /* bound the total session drc memory ussage */ > + =A0 =A0 =A0 if (mem + nfsd_drc_mem_used > nfsd_drc_max_mem) { > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 fchan->maxreqs =3D (nfsd_drc_max_mem - = nfsd_drc_mem_used) / size; > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 mem =3D fchan->maxreqs * size; > + =A0 =A0 =A0 } > =A0 =A0 =A0 =A0nfsd_drc_mem_used +=3D mem; > =A0 =A0 =A0 =A0spin_unlock(&nfsd_drc_lock); > > - =A0 =A0 =A0 fchan->maxreqs =3D mem / NFSD_SLOT_CACHE_SIZE; > =A0 =A0 =A0 =A0if (fchan->maxreqs =3D=3D 0) > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 return nfserr_resource; > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 return nfserr_serverfault; ^^^^^^^^^^^^^^^^^^^ This change belongs in patch [PATCH 4/5] nfsd41: replace nfserr_resource in pure nfs41 responses -->Andy > + > + =A0 =A0 =A0 fchan->maxresp_cached =3D size + NFSD_MIN_HDR_SEQ_SZ; > =A0 =A0 =A0 =A0return 0; > =A0} > > @@ -466,9 +492,6 @@ static int init_forechannel_attrs(struct svc_rqst= *rqstp, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0fchan->maxresp_sz =3D maxcount; > =A0 =A0 =A0 =A0session_fchan->maxresp_sz =3D fchan->maxresp_sz; > > - =A0 =A0 =A0 session_fchan->maxresp_cached =3D NFSD_SLOT_CACHE_SIZE; > - =A0 =A0 =A0 fchan->maxresp_cached =3D session_fchan->maxresp_cached= ; > - > =A0 =A0 =A0 =A0/* Use the client's maxops if possible */ > =A0 =A0 =A0 =A0if (fchan->maxops > NFSD_MAX_OPS_PER_COMPOUND) > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0fchan->maxops =3D NFSD_MAX_OPS_PER_COM= POUND; > @@ -478,9 +501,12 @@ static int init_forechannel_attrs(struct svc_rqs= t *rqstp, > =A0 =A0 =A0 =A0 * recover pages from existing sessions. For now fail = session > =A0 =A0 =A0 =A0 * creation. > =A0 =A0 =A0 =A0 */ > - =A0 =A0 =A0 status =3D set_forechannel_maxreqs(fchan); > + =A0 =A0 =A0 status =3D set_forechannel_drc_size(fchan); > > + =A0 =A0 =A0 session_fchan->maxresp_cached =3D fchan->maxresp_cached= ; > =A0 =A0 =A0 =A0session_fchan->maxreqs =3D fchan->maxreqs; > + > + =A0 =A0 =A0 dprintk("%s status %d\n", __func__, status); > =A0 =A0 =A0 =A0return status; > =A0} > > diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h > index fb0c404..ff0b771 100644 > --- a/include/linux/nfsd/state.h > +++ b/include/linux/nfsd/state.h > @@ -92,13 +92,17 @@ struct nfs4_cb_conn { > =A0 =A0 =A0 =A0struct rpc_cred * =A0 =A0 =A0 cb_cred; > =A0}; > > -/* Maximum number of slots per session. 128 is useful for long haul = TCP */ > -#define NFSD_MAX_SLOTS_PER_SESSION =A0 =A0 128 > +/* Maximum number of slots per session. 160 is useful for long haul = TCP */ > +#define NFSD_MAX_SLOTS_PER_SESSION =A0 =A0 160 > =A0/* Maximum number of pages per slot cache entry */ > =A0#define NFSD_PAGES_PER_SLOT =A0 =A01 > =A0#define NFSD_SLOT_CACHE_SIZE =A0 =A0 =A0 =A0 =A0 PAGE_SIZE > =A0/* Maximum number of operations per session compound */ > =A0#define NFSD_MAX_OPS_PER_COMPOUND =A0 =A0 =A016 > +/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */ > +#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION =A0 =A0 =A032 > +#define NFSD_MAX_MEM_PER_SESSION =A0\ > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 (NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NF= SD_SLOT_CACHE_SIZE) > > =A0struct nfsd4_cache_entry { > =A0 =A0 =A0 =A0__be32 =A0 =A0 =A0 =A0 =A0ce_status; > -- > 1.6.2.5 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" = in > the body of a message to majordomo@vger.kernel.org > More majordomo info at =A0http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage 2009-08-27 16:07 ` [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage andros 2009-08-27 16:07 ` [PATCH 3/5] nfsd41: use session maxreqs for sequence target and highest slotid andros 2009-08-27 17:12 ` [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage William A. (Andy) Adamson @ 2009-08-28 20:41 ` J. Bruce Fields 2009-08-28 20:56 ` J. Bruce Fields 2009-08-30 6:38 ` [pnfs] " Benny Halevy 2 siblings, 2 replies; 23+ messages in thread From: J. Bruce Fields @ 2009-08-28 20:41 UTC (permalink / raw) To: andros; +Cc: linux-nfs, pnfs Looks basically fine, but there are a few nitpicky problems: On Thu, Aug 27, 2009 at 12:07:41PM -0400, andros@netapp.com wrote: > From: Andy Adamson <andros@netapp.com> > > By using the requested ca_maxresponsesize_cached * ca_maxresponses to bound > a forechannel drc request size, clients can tailor a session to usage. > > For example, an I/O session (READ/WRITE only) can have a much smaller > ca_maxresponsesize_cached (for only WRITE compound responses) and a lot larger > ca_maxresponses to service a large in-flight data window. > > Signed-off-by: Andy Adamson <andros@netapp.com> > --- > fs/nfsd/nfs4state.c | 60 +++++++++++++++++++++++++++++++------------ > include/linux/nfsd/state.h | 8 ++++- > 2 files changed, 49 insertions(+), 19 deletions(-) > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > index ddfd36f..a691139 100644 > --- a/fs/nfsd/nfs4state.c > +++ b/fs/nfsd/nfs4state.c > @@ -414,34 +414,60 @@ gen_sessionid(struct nfsd4_session *ses) > } > > /* > - * Give the client the number of slots it requests bound by > - * NFSD_MAX_SLOTS_PER_SESSION and by nfsd_drc_max_mem. > + * 32 bytes of RPC header and 44 bytes of sequence operation response > + * not included in NFSD_SLOT_CACHE_SIZE > + * */ > +#define NFSD_MIN_HDR_SEQ_SZ (32 + 44) I took a look at a trace and got 24 for the rpc header (xid through accept state), 12 for compound header (status, empty tag, opcount), and 44 for the sequence response (opcode through status flags), 80 together--could you double check this? > + > +/* > + * Give the client the number of ca_maxresponsesize_cached slots it requests > + * bounded by NFSD_SLOT_CACHE_SIZE, NFSD_MAX_MEM_PER_SESSION and by > + * nfsd_drc_max_mem. Do not allow more than NFSD_MAX_SLOTS_PER_SESSION. > + * > + * The ca_maxresponssize_cached definition includes the size > + * of the rpc header with the variable length security flavor credential > + * plus verifier (32 bytes with AUTH_SYS and NULL verifier) Note there's no credential in the response, just a verifier. My attempt at these two comments: /* * The protocol defines ca_maxresponssize_cached to include the size of * the rpc header, but all we need to cache is the data starting after * the end of the initial SEQUENCE operation--the rest we regenerate * each time. Therefore we can advertise a ca_maxresponssize_cached * value that is the number of bytes in our cache plus a few additional * bytes. In order to stay on the safe side, and not promise more than * we can cache, those additional bytes must be the minimum possible: 24 * bytes of rpc header (xid through accept state, with AUTH_NULL * verifier), 12 for the compound header (with zero-length tag), and 44 * for the SEQUENCE op response: */ #define NFSD_MIN_HDR_SEQ_SZ (24 + 12 + 44) /* * Give the client the number of ca_maxresponsesize_cached slots it * requests, of size bounded by NFSD_SLOT_CACHE_SIZE, * NFSD_MAX_MEM_PER_SESSION, and nfsd_drc_max_mem. Do not allow more * than NFSD_MAX_SLOTS_PER_SESSION. * * If we run out of reserved DRC memory we should (up to a point) * re-negotiate active sessions and reduce their slot usage to make * rooom for new connections. For now we just fail the create session. */ Does that look right? > + * as well as the encoded SEQUENCE operation response (44 bytes) > + * which are not included in NFSD_SLOT_CACHE_SIZE. > + * We err on the side of being a bit small when AUTH_SYS/NULL verifier > + * is not used. > * > * If we run out of reserved DRC memory we should (up to a point) re-negotiate > * active sessions and reduce their slot usage to make rooom for new > * connections. For now we just fail the create session. > */ > -static int set_forechannel_maxreqs(struct nfsd4_channel_attrs *fchan) > +static int set_forechannel_drc_size(struct nfsd4_channel_attrs *fchan) > { > - int mem; > + int mem, size = fchan->maxresp_cached - NFSD_MIN_HDR_SEQ_SZ; > > - if (fchan->maxreqs < 1) > + if (fchan->maxreqs < 1 || size <= 0) > return nfserr_inval; Is it really illegal for the client to request a maxresp_cached less than NFSD_MIN_HDR_SEQ_SIZE? I think this should just be rounded up to zero. Also watch out for overflow: if the client gives an extremely large value for maxresp_cached then we should round it down rather than doing arithmetic that might result in overflow. Simplest might be to first round maxresp_cached up to NFSD_MIN_HDR_SEQ_SIZE, if it's less. *Then* subtract NFSD_MIN_HDR_SEQ_SIZE. Then round down to NFSD_SLOT_CACHE_SIZE, if it's too large. And keep all of this signed. > - else if (fchan->maxreqs > NFSD_MAX_SLOTS_PER_SESSION) > - fchan->maxreqs = NFSD_MAX_SLOTS_PER_SESSION; > > - mem = fchan->maxreqs * NFSD_SLOT_CACHE_SIZE; > + if (size > NFSD_SLOT_CACHE_SIZE) > + size = NFSD_SLOT_CACHE_SIZE; > + > + /* bound the maxreqs by NFSD_MAX_MEM_PER_SESSION */ > + mem = fchan->maxreqs * size; > + if (mem > NFSD_MAX_MEM_PER_SESSION) { > + fchan->maxreqs = NFSD_MAX_MEM_PER_SESSION / size; > + if (fchan->maxreqs > NFSD_MAX_SLOTS_PER_SESSION) > + fchan->maxreqs = NFSD_MAX_SLOTS_PER_SESSION; > + mem = fchan->maxreqs * size; > + } > > spin_lock(&nfsd_drc_lock); > - if (mem + nfsd_drc_mem_used > nfsd_drc_max_mem) > - mem = ((nfsd_drc_max_mem - nfsd_drc_mem_used) / > - NFSD_SLOT_CACHE_SIZE) * NFSD_SLOT_CACHE_SIZE; > + /* bound the total session drc memory ussage */ > + if (mem + nfsd_drc_mem_used > nfsd_drc_max_mem) { > + fchan->maxreqs = (nfsd_drc_max_mem - nfsd_drc_mem_used) / size; > + mem = fchan->maxreqs * size; > + } > nfsd_drc_mem_used += mem; > spin_unlock(&nfsd_drc_lock); > > - fchan->maxreqs = mem / NFSD_SLOT_CACHE_SIZE; > if (fchan->maxreqs == 0) > - return nfserr_resource; > + return nfserr_serverfault; Remind me why serverfault and not resource here? --b. > + > + fchan->maxresp_cached = size + NFSD_MIN_HDR_SEQ_SZ; > return 0; > } > > @@ -466,9 +492,6 @@ static int init_forechannel_attrs(struct svc_rqst *rqstp, > fchan->maxresp_sz = maxcount; > session_fchan->maxresp_sz = fchan->maxresp_sz; > > - session_fchan->maxresp_cached = NFSD_SLOT_CACHE_SIZE; > - fchan->maxresp_cached = session_fchan->maxresp_cached; > - > /* Use the client's maxops if possible */ > if (fchan->maxops > NFSD_MAX_OPS_PER_COMPOUND) > fchan->maxops = NFSD_MAX_OPS_PER_COMPOUND; > @@ -478,9 +501,12 @@ static int init_forechannel_attrs(struct svc_rqst *rqstp, > * recover pages from existing sessions. For now fail session > * creation. > */ > - status = set_forechannel_maxreqs(fchan); > + status = set_forechannel_drc_size(fchan); > > + session_fchan->maxresp_cached = fchan->maxresp_cached; > session_fchan->maxreqs = fchan->maxreqs; > + > + dprintk("%s status %d\n", __func__, status); > return status; > } > > diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h > index fb0c404..ff0b771 100644 > --- a/include/linux/nfsd/state.h > +++ b/include/linux/nfsd/state.h > @@ -92,13 +92,17 @@ struct nfs4_cb_conn { > struct rpc_cred * cb_cred; > }; > > -/* Maximum number of slots per session. 128 is useful for long haul TCP */ > -#define NFSD_MAX_SLOTS_PER_SESSION 128 > +/* Maximum number of slots per session. 160 is useful for long haul TCP */ > +#define NFSD_MAX_SLOTS_PER_SESSION 160 > /* Maximum number of pages per slot cache entry */ > #define NFSD_PAGES_PER_SLOT 1 > #define NFSD_SLOT_CACHE_SIZE PAGE_SIZE > /* Maximum number of operations per session compound */ > #define NFSD_MAX_OPS_PER_COMPOUND 16 > +/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */ > +#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION 32 > +#define NFSD_MAX_MEM_PER_SESSION \ > + (NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE) > > struct nfsd4_cache_entry { > __be32 ce_status; > -- > 1.6.2.5 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage 2009-08-28 20:41 ` J. Bruce Fields @ 2009-08-28 20:56 ` J. Bruce Fields 2009-08-30 6:38 ` [pnfs] " Benny Halevy 1 sibling, 0 replies; 23+ messages in thread From: J. Bruce Fields @ 2009-08-28 20:56 UTC (permalink / raw) To: andros; +Cc: linux-nfs, pnfs On Fri, Aug 28, 2009 at 04:41:16PM -0400, bfields wrote: > Looks basically fine, but there are a few nitpicky problems: > > On Thu, Aug 27, 2009 at 12:07:41PM -0400, andros@netapp.com wrote: > > From: Andy Adamson <andros@netapp.com> > > > > By using the requested ca_maxresponsesize_cached * ca_maxresponses to bound > > a forechannel drc request size, clients can tailor a session to usage. > > > > For example, an I/O session (READ/WRITE only) can have a much smaller > > ca_maxresponsesize_cached (for only WRITE compound responses) and a lot larger > > ca_maxresponses to service a large in-flight data window. > > > > Signed-off-by: Andy Adamson <andros@netapp.com> > > --- > > fs/nfsd/nfs4state.c | 60 +++++++++++++++++++++++++++++++------------ > > include/linux/nfsd/state.h | 8 ++++- > > 2 files changed, 49 insertions(+), 19 deletions(-) > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > index ddfd36f..a691139 100644 > > --- a/fs/nfsd/nfs4state.c > > +++ b/fs/nfsd/nfs4state.c > > @@ -414,34 +414,60 @@ gen_sessionid(struct nfsd4_session *ses) > > } > > > > /* > > - * Give the client the number of slots it requests bound by > > - * NFSD_MAX_SLOTS_PER_SESSION and by nfsd_drc_max_mem. > > + * 32 bytes of RPC header and 44 bytes of sequence operation response > > + * not included in NFSD_SLOT_CACHE_SIZE > > + * */ > > +#define NFSD_MIN_HDR_SEQ_SZ (32 + 44) > > I took a look at a trace and got 24 for the rpc header (xid through > accept state), 12 for compound header (status, empty tag, opcount), and > 44 for the sequence response (opcode through status flags), 80 > together--could you double check this? > > > + > > +/* > > + * Give the client the number of ca_maxresponsesize_cached slots it requests > > + * bounded by NFSD_SLOT_CACHE_SIZE, NFSD_MAX_MEM_PER_SESSION and by > > + * nfsd_drc_max_mem. Do not allow more than NFSD_MAX_SLOTS_PER_SESSION. > > + * > > + * The ca_maxresponssize_cached definition includes the size > > + * of the rpc header with the variable length security flavor credential > > + * plus verifier (32 bytes with AUTH_SYS and NULL verifier) > > Note there's no credential in the response, just a verifier. My > attempt at these two comments: > > /* > * The protocol defines ca_maxresponssize_cached to include the size of > * the rpc header, but all we need to cache is the data starting after > * the end of the initial SEQUENCE operation--the rest we regenerate > * each time. Therefore we can advertise a ca_maxresponssize_cached > * value that is the number of bytes in our cache plus a few additional > * bytes. In order to stay on the safe side, and not promise more than > * we can cache, those additional bytes must be the minimum possible: 24 > * bytes of rpc header (xid through accept state, with AUTH_NULL > * verifier), 12 for the compound header (with zero-length tag), and 44 > * for the SEQUENCE op response: > */ > #define NFSD_MIN_HDR_SEQ_SZ (24 + 12 + 44) > > /* > * Give the client the number of ca_maxresponsesize_cached slots it > * requests, of size bounded by NFSD_SLOT_CACHE_SIZE, > * NFSD_MAX_MEM_PER_SESSION, and nfsd_drc_max_mem. Do not allow more > * than NFSD_MAX_SLOTS_PER_SESSION. > * > * If we run out of reserved DRC memory we should (up to a point) > * re-negotiate active sessions and reduce their slot usage to make > * rooom for new connections. For now we just fail the create session. > */ > > Does that look right? > > > + * as well as the encoded SEQUENCE operation response (44 bytes) > > + * which are not included in NFSD_SLOT_CACHE_SIZE. > > + * We err on the side of being a bit small when AUTH_SYS/NULL verifier > > + * is not used. > > * > > * If we run out of reserved DRC memory we should (up to a point) re-negotiate > > * active sessions and reduce their slot usage to make rooom for new > > * connections. For now we just fail the create session. > > */ > > -static int set_forechannel_maxreqs(struct nfsd4_channel_attrs *fchan) > > +static int set_forechannel_drc_size(struct nfsd4_channel_attrs *fchan) > > { > > - int mem; > > + int mem, size = fchan->maxresp_cached - NFSD_MIN_HDR_SEQ_SZ; > > > > - if (fchan->maxreqs < 1) > > + if (fchan->maxreqs < 1 || size <= 0) > > return nfserr_inval; > > Is it really illegal for the client to request a maxresp_cached less than > NFSD_MIN_HDR_SEQ_SIZE? I think this should just be rounded up to zero. > > Also watch out for overflow: if the client gives an extremely large > value for maxresp_cached then we should round it down rather than doing > arithmetic that might result in overflow. > > Simplest might be to first round maxresp_cached up to > NFSD_MIN_HDR_SEQ_SIZE, if it's less. *Then* subtract > NFSD_MIN_HDR_SEQ_SIZE. Then round down to NFSD_SLOT_CACHE_SIZE, if it's > too large. And keep all of this signed. > > > - else if (fchan->maxreqs > NFSD_MAX_SLOTS_PER_SESSION) > > - fchan->maxreqs = NFSD_MAX_SLOTS_PER_SESSION; > > > > - mem = fchan->maxreqs * NFSD_SLOT_CACHE_SIZE; > > + if (size > NFSD_SLOT_CACHE_SIZE) > > + size = NFSD_SLOT_CACHE_SIZE; > > + > > + /* bound the maxreqs by NFSD_MAX_MEM_PER_SESSION */ > > + mem = fchan->maxreqs * size; > > + if (mem > NFSD_MAX_MEM_PER_SESSION) { > > + fchan->maxreqs = NFSD_MAX_MEM_PER_SESSION / size; > > + if (fchan->maxreqs > NFSD_MAX_SLOTS_PER_SESSION) > > + fchan->maxreqs = NFSD_MAX_SLOTS_PER_SESSION; > > + mem = fchan->maxreqs * size; > > + } > > > > spin_lock(&nfsd_drc_lock); > > - if (mem + nfsd_drc_mem_used > nfsd_drc_max_mem) > > - mem = ((nfsd_drc_max_mem - nfsd_drc_mem_used) / > > - NFSD_SLOT_CACHE_SIZE) * NFSD_SLOT_CACHE_SIZE; > > + /* bound the total session drc memory ussage */ > > + if (mem + nfsd_drc_mem_used > nfsd_drc_max_mem) { > > + fchan->maxreqs = (nfsd_drc_max_mem - nfsd_drc_mem_used) / size; > > + mem = fchan->maxreqs * size; > > + } > > nfsd_drc_mem_used += mem; > > spin_unlock(&nfsd_drc_lock); > > > > - fchan->maxreqs = mem / NFSD_SLOT_CACHE_SIZE; > > if (fchan->maxreqs == 0) > > - return nfserr_resource; > > + return nfserr_serverfault; > > Remind me why serverfault and not resource here? Whoops, OK, I see that one's answered later--it's not a legal error any more. --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [pnfs] [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage 2009-08-28 20:41 ` J. Bruce Fields 2009-08-28 20:56 ` J. Bruce Fields @ 2009-08-30 6:38 ` Benny Halevy 1 sibling, 0 replies; 23+ messages in thread From: Benny Halevy @ 2009-08-30 6:38 UTC (permalink / raw) To: J. Bruce Fields; +Cc: andros, linux-nfs, pnfs On Aug. 28, 2009, 23:41 +0300, "J. Bruce Fields" <bfields@fieldses.org> wrote: > Remind me why serverfault and not resource here? > > --b. > NFS4ERR_RESOURCE is not a valid nfsv4.1 error. Benny ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/5] nfsd41: expand solo sequence check 2009-08-27 16:07 ` [PATCH 1/5] nfsd41: expand solo sequence check andros 2009-08-27 16:07 ` [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage andros @ 2009-08-27 21:42 ` J. Bruce Fields 1 sibling, 0 replies; 23+ messages in thread From: J. Bruce Fields @ 2009-08-27 21:42 UTC (permalink / raw) To: andros; +Cc: linux-nfs, pnfs On Thu, Aug 27, 2009 at 12:07:40PM -0400, andros@netapp.com wrote: > From: Andy Adamson <andros@netapp.com> > > Failed sequenece == solo sequence. > Remove redundant sequence operation cache checks. Applied, thanks. --b. > > Signed-off-by: Andy Adamson <andros@netapp.com> > --- > fs/nfsd/nfs4state.c | 9 --------- > include/linux/nfsd/xdr4.h | 2 +- > 2 files changed, 1 insertions(+), 10 deletions(-) > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > index d2a0524..ddfd36f 100644 > --- a/fs/nfsd/nfs4state.c > +++ b/fs/nfsd/nfs4state.c > @@ -991,16 +991,10 @@ nfsd4_store_cache_entry(struct nfsd4_compoundres *resp) > { > struct nfsd4_cache_entry *entry = &resp->cstate.slot->sl_cache_entry; > struct svc_rqst *rqstp = resp->rqstp; > - struct nfsd4_compoundargs *args = rqstp->rq_argp; > - struct nfsd4_op *op = &args->ops[resp->opcnt]; > struct kvec *resv = &rqstp->rq_res.head[0]; > > dprintk("--> %s entry %p\n", __func__, entry); > > - /* Don't cache a failed OP_SEQUENCE. */ > - if (resp->opcnt == 1 && op->opnum == OP_SEQUENCE && resp->cstate.status) > - return; > - > nfsd4_release_respages(entry->ce_respages, entry->ce_resused); > entry->ce_opcnt = resp->opcnt; > entry->ce_status = resp->cstate.status; > @@ -1490,9 +1484,6 @@ nfsd4_sequence(struct svc_rqst *rqstp, > slot->sl_inuse = true; > slot->sl_seqid = seq->seqid; > slot->sl_cache_entry.ce_cachethis = seq->cachethis; > - /* Always set the cache entry cachethis for solo sequence */ > - if (nfsd4_is_solo_sequence(resp)) > - slot->sl_cache_entry.ce_cachethis = 1; > > cstate->slot = slot; > cstate->session = session; > diff --git a/include/linux/nfsd/xdr4.h b/include/linux/nfsd/xdr4.h > index 5e4beb0..3f71660 100644 > --- a/include/linux/nfsd/xdr4.h > +++ b/include/linux/nfsd/xdr4.h > @@ -467,7 +467,7 @@ struct nfsd4_compoundres { > static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp) > { > struct nfsd4_compoundargs *args = resp->rqstp->rq_argp; > - return args->opcnt == 1; > + return resp->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE; > } > > static inline bool nfsd4_not_cached(struct nfsd4_compoundres *resp) > -- > 1.6.2.5 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] NFSv4.1 DRC rewrite version 6 2009-08-27 16:07 [PATCH 0/5] NFSv4.1 DRC rewrite version 6 andros 2009-08-27 16:07 ` [PATCH 1/5] nfsd41: expand solo sequence check andros @ 2009-08-28 21:34 ` J. Bruce Fields 2009-08-28 22:42 ` J. Bruce Fields 2 siblings, 0 replies; 23+ messages in thread From: J. Bruce Fields @ 2009-08-28 21:34 UTC (permalink / raw) To: andros; +Cc: linux-nfs, pnfs On Thu, Aug 27, 2009 at 12:07:39PM -0400, andros@netapp.com wrote: > > NFSv4.1 DRC Rewrite Version 6 > > These patches apply against git://linux-nfs.org/~bfields/linux: for-2.6.32 > branch and continue to rewrite the NFSv4.1 Sessions DRC. Besides some > bug fixes: Thanks for following up on this. It looks fine except for a few nits noted on replies to individual patches. Could you fix those up and resend? And then I think #7 should be it.... --b. > > 1) The bound on the fore channel per-session DRC size are rewritten. > Instead of just using a maximum number of slots to bound the size, > The server gives the client the number of ca_maxresponsesize_cached slots it > requests bounded by NFSD_SLOT_CACHE_SIZE, NFSD_MAX_MEM_PER_SESSION and by > nfsd_drc_max_mem. Do not allow more than NFSD_MAX_SLOTS_PER_SESSION. > > This allows clients to tailor a session to usage. > For example, an I/O session (READ/WRITE/COMMIT only) can have a much smaller > ca_maxresponsesize_cached (for only WRITE/COMMIt compound responses) and a lot > larger ca_maxresponses to service a large in-flight data window. > > 2) the page-based DRC is replaced with a buffer based DRC with each > slot table entry (struct nfsd4_slot + cache) allocated separately. > This allocation prepares us for slot size re-negotiation via the SEQUENCE > operation target and high slot id arguments. > > Testing: > > NFSv4.1 mount: pynfs tests - including the SEQUENCE replay cache tests. > connectathon tests. > > NFSv4.0 mount: connectathon tests. > > -->Andy > > 0001-nfsd41-expand-solo-sequence-check.patch > 0002-nfsd41-bound-forechannel-drc-size-by-memory-usage.patch > 0003-nfsd41-use-session-maxreqs-for-sequence-target-and.patch > 0004-nfsd41-replace-nfserr_resource-in-pure-nfs41-respon.patch > 0005-nfsd41-replace-page-based-DRC-with-buffer-based-DRC.patch > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] NFSv4.1 DRC rewrite version 6 2009-08-27 16:07 [PATCH 0/5] NFSv4.1 DRC rewrite version 6 andros 2009-08-27 16:07 ` [PATCH 1/5] nfsd41: expand solo sequence check andros 2009-08-28 21:34 ` [PATCH 0/5] NFSv4.1 DRC rewrite version 6 J. Bruce Fields @ 2009-08-28 22:42 ` J. Bruce Fields 2009-08-28 22:56 ` [pnfs] " J. Bruce Fields 2 siblings, 1 reply; 23+ messages in thread From: J. Bruce Fields @ 2009-08-28 22:42 UTC (permalink / raw) To: andros; +Cc: linux-nfs, pnfs On Thu, Aug 27, 2009 at 12:07:39PM -0400, andros@netapp.com wrote: > > NFSv4.1 DRC Rewrite Version 6 > > These patches apply against git://linux-nfs.org/~bfields/linux: for-2.6.32 > branch and continue to rewrite the NFSv4.1 Sessions DRC. Besides some > bug fixes: > > 1) The bound on the fore channel per-session DRC size are rewritten. > Instead of just using a maximum number of slots to bound the size, > The server gives the client the number of ca_maxresponsesize_cached slots it > requests bounded by NFSD_SLOT_CACHE_SIZE, NFSD_MAX_MEM_PER_SESSION and by > nfsd_drc_max_mem. Do not allow more than NFSD_MAX_SLOTS_PER_SESSION. > > This allows clients to tailor a session to usage. > For example, an I/O session (READ/WRITE/COMMIT only) can have a much smaller > ca_maxresponsesize_cached (for only WRITE/COMMIt compound responses) and a lot > larger ca_maxresponses to service a large in-flight data window. > > 2) the page-based DRC is replaced with a buffer based DRC with each > slot table entry (struct nfsd4_slot + cache) allocated separately. > This allocation prepares us for slot size re-negotiation via the SEQUENCE > operation target and high slot id arguments. > > Testing: > > NFSv4.1 mount: pynfs tests - including the SEQUENCE replay cache tests. > connectathon tests. By the way, any hints on getting the 4.1 pynfs stuff running? ./nfs4.1/testserver.py pearlet1:/pynfs41test-root --rundeps --maketree Traceback (most recent call last): File "./nfs4.1/testserver.py", line 38, in <module> import server41tests.environment as environment File "/root/pynfs41/nfs4.1/server41tests/environment.py", line 15, in <module> import rpc File "/root/pynfs41/rpc/__init__.py", line 1, in <module> from rpc import * File "/root/pynfs41/rpc/rpc.py", line 14, in <module> import security File "/root/pynfs41/rpc/security.py", line 11, in <module> import gssapi ImportError: No module named gssapi With the 4.0 pynfs I've always run /setup.py build_ext --inplace before. That gives me: This is currently broken...everything should already be set up. Just go into nfs4 and play. The setup.py in nfs4.1/ complains about not finding xdrgen. --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [pnfs] [PATCH 0/5] NFSv4.1 DRC rewrite version 6 2009-08-28 22:42 ` J. Bruce Fields @ 2009-08-28 22:56 ` J. Bruce Fields 2009-08-28 23:07 ` J. Bruce Fields 0 siblings, 1 reply; 23+ messages in thread From: J. Bruce Fields @ 2009-08-28 22:56 UTC (permalink / raw) To: andros; +Cc: linux-nfs, pnfs On Fri, Aug 28, 2009 at 06:42:48PM -0400, J. Bruce Fields wrote: > On Thu, Aug 27, 2009 at 12:07:39PM -0400, andros@netapp.com wrote: > > > > NFSv4.1 DRC Rewrite Version 6 > > > > These patches apply against git://linux-nfs.org/~bfields/linux: for-2.6.32 > > branch and continue to rewrite the NFSv4.1 Sessions DRC. Besides some > > bug fixes: > > > > 1) The bound on the fore channel per-session DRC size are rewritten. > > Instead of just using a maximum number of slots to bound the size, > > The server gives the client the number of ca_maxresponsesize_cached slots it > > requests bounded by NFSD_SLOT_CACHE_SIZE, NFSD_MAX_MEM_PER_SESSION and by > > nfsd_drc_max_mem. Do not allow more than NFSD_MAX_SLOTS_PER_SESSION. > > > > This allows clients to tailor a session to usage. > > For example, an I/O session (READ/WRITE/COMMIT only) can have a much smaller > > ca_maxresponsesize_cached (for only WRITE/COMMIt compound responses) and a lot > > larger ca_maxresponses to service a large in-flight data window. > > > > 2) the page-based DRC is replaced with a buffer based DRC with each > > slot table entry (struct nfsd4_slot + cache) allocated separately. > > This allocation prepares us for slot size re-negotiation via the SEQUENCE > > operation target and high slot id arguments. > > > > Testing: > > > > NFSv4.1 mount: pynfs tests - including the SEQUENCE replay cache tests. > > connectathon tests. > > By the way, any hints on getting the 4.1 pynfs stuff running? OK, solution seemed to be cd gssapi python setup.py build_ext --inplace But now almost everything fails.with SERVERFAULT. Bah. --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [pnfs] [PATCH 0/5] NFSv4.1 DRC rewrite version 6 2009-08-28 22:56 ` [pnfs] " J. Bruce Fields @ 2009-08-28 23:07 ` J. Bruce Fields 2009-09-08 14:43 ` J. Bruce Fields 0 siblings, 1 reply; 23+ messages in thread From: J. Bruce Fields @ 2009-08-28 23:07 UTC (permalink / raw) To: andros; +Cc: linux-nfs, pnfs On Fri, Aug 28, 2009 at 06:56:08PM -0400, bfields wrote: > On Fri, Aug 28, 2009 at 06:42:48PM -0400, J. Bruce Fields wrote: > > On Thu, Aug 27, 2009 at 12:07:39PM -0400, andros@netapp.com wrote: > > > > > > NFSv4.1 DRC Rewrite Version 6 > > > > > > These patches apply against git://linux-nfs.org/~bfields/linux: for-2.6.32 > > > branch and continue to rewrite the NFSv4.1 Sessions DRC. Besides some > > > bug fixes: > > > > > > 1) The bound on the fore channel per-session DRC size are rewritten. > > > Instead of just using a maximum number of slots to bound the size, > > > The server gives the client the number of ca_maxresponsesize_cached slots it > > > requests bounded by NFSD_SLOT_CACHE_SIZE, NFSD_MAX_MEM_PER_SESSION and by > > > nfsd_drc_max_mem. Do not allow more than NFSD_MAX_SLOTS_PER_SESSION. > > > > > > This allows clients to tailor a session to usage. > > > For example, an I/O session (READ/WRITE/COMMIT only) can have a much smaller > > > ca_maxresponsesize_cached (for only WRITE/COMMIt compound responses) and a lot > > > larger ca_maxresponses to service a large in-flight data window. > > > > > > 2) the page-based DRC is replaced with a buffer based DRC with each > > > slot table entry (struct nfsd4_slot + cache) allocated separately. > > > This allocation prepares us for slot size re-negotiation via the SEQUENCE > > > operation target and high slot id arguments. > > > > > > Testing: > > > > > > NFSv4.1 mount: pynfs tests - including the SEQUENCE replay cache tests. > > > connectathon tests. > > > > By the way, any hints on getting the 4.1 pynfs stuff running? > > OK, solution seemed to be > > cd gssapi > python setup.py build_ext --inplace > > But now almost everything fails.with SERVERFAULT. Bah. Does anyone have a list of which tests are expected to pass at this point? If I ask for "all" tests, I get "4 skipped, 103 failed, 0 warned, 26 passed--full list available if anyone wants. I seem to be passing connectathon tests OK over 4.1. --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [pnfs] [PATCH 0/5] NFSv4.1 DRC rewrite version 6 2009-08-28 23:07 ` J. Bruce Fields @ 2009-09-08 14:43 ` J. Bruce Fields 0 siblings, 0 replies; 23+ messages in thread From: J. Bruce Fields @ 2009-09-08 14:43 UTC (permalink / raw) To: andros; +Cc: linux-nfs, pnfs On Fri, Aug 28, 2009 at 07:07:27PM -0400, bfields wrote: > On Fri, Aug 28, 2009 at 06:56:08PM -0400, bfields wrote: > > On Fri, Aug 28, 2009 at 06:42:48PM -0400, J. Bruce Fields wrote: > > > By the way, any hints on getting the 4.1 pynfs stuff running? > > > > OK, solution seemed to be > > > > cd gssapi > > python setup.py build_ext --inplace > > > > But now almost everything fails.with SERVERFAULT. Bah. > > Does anyone have a list of which tests are expected to pass at this point? If > I ask for "all" tests, I get "4 skipped, 103 failed, 0 warned, 26 passed--full > list available if anyone wants. I seem to be passing connectathon tests > OK over 4.1. If someone could just send a copy of what they consider succesful (for now) pynfs output, that would be useful. --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2009-09-08 14:43 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-27 16:07 [PATCH 0/5] NFSv4.1 DRC rewrite version 6 andros
2009-08-27 16:07 ` [PATCH 1/5] nfsd41: expand solo sequence check andros
2009-08-27 16:07 ` [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage andros
2009-08-27 16:07 ` [PATCH 3/5] nfsd41: use session maxreqs for sequence target and highest slotid andros
2009-08-27 16:07 ` [PATCH 4/5] nfsd41: replace nfserr_resource in pure nfs41 responses andros
2009-08-27 16:07 ` [PATCH 5/5] nfsd41: replace page based DRC with buffer based DRC andros
2009-08-28 21:33 ` J. Bruce Fields
2009-08-30 23:10 ` [pnfs] " Benny Halevy
2009-08-31 13:08 ` J. Bruce Fields
2009-08-31 13:43 ` William A. (Andy) Adamson
2009-09-01 13:48 ` William A. (Andy) Adamson
[not found] ` <89c397150909010648v4a4f5db8t87e09717e5a2c950-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-09-01 15:14 ` J. Bruce Fields
2009-09-01 15:22 ` William A. (Andy) Adamson
2009-08-27 17:12 ` [PATCH 2/5] nfsd41: bound forechannel drc size by memory usage William A. (Andy) Adamson
2009-08-28 20:41 ` J. Bruce Fields
2009-08-28 20:56 ` J. Bruce Fields
2009-08-30 6:38 ` [pnfs] " Benny Halevy
2009-08-27 21:42 ` [PATCH 1/5] nfsd41: expand solo sequence check J. Bruce Fields
2009-08-28 21:34 ` [PATCH 0/5] NFSv4.1 DRC rewrite version 6 J. Bruce Fields
2009-08-28 22:42 ` J. Bruce Fields
2009-08-28 22:56 ` [pnfs] " J. Bruce Fields
2009-08-28 23:07 ` J. Bruce Fields
2009-09-08 14:43 ` J. Bruce Fields
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox