From: Mi Jinlong <mijinlong@cn.fujitsu.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: NFS <linux-nfs@vger.kernel.org>
Subject: Re: [RFC][PATCH v2] nfsd41: try to check reply size before operation
Date: Wed, 24 Aug 2011 17:07:55 +0800 [thread overview]
Message-ID: <4E54BF6B.4020803@cn.fujitsu.com> (raw)
In-Reply-To: <20110823213916.GB25350@fieldses.org>
J. Bruce Fields :
> On Sat, Aug 20, 2011 at 06:11:31PM +0800, Mi Jinlong wrote:
>> For checking the size of reply before calling a operation,
>> we need try to get maxsize of the operation's reply.
>>
>> v1->v2:
>> move op_enc_size from nfsd4_enc_ops to nfsd4_operation;
>> add helper function to get payload len which is need as READ, READDIR ...
>
> I just want to make sure I understand the logic here. So while
> encoding this operation, we estimate the size of the *next* operation:
Yes,
>
>> @@ -1466,6 +1791,22 @@ static const char *nfsd4_op_name(unsigned opnum)
>> return "unknown_operation";
>> }
>>
>> +u32 get_ops_max_respz(struct svc_rqst * rqstp)
>> +{
>> + struct nfsd4_compoundargs *args = rqstp->rq_argp;
>> + struct nfsd4_compoundres *resp = rqstp->rq_resp;
>> + __be32 opnum = args->ops[resp->opcnt].opnum;
>> + __be32 length = 0;
>> +
>> + length = nfsd4_ops[opnum].op_enc_size * 4;
>> + if (nfsd4_ops[opnum].op_payload)
>> + length += nfsd4_ops[opnum].op_payload(rqstp);
>> +
>> + dprintk("%s opnum %u max reply %u\n", __func__, opnum, length);
>> +
>> + return length;
>> +}
>> +
>> #define nfsd4_voidres nfsd4_voidargs
>> struct nfsd4_voidargs { int dummy; };
>
> and we stick the result into the next op's status field:
Yes,
>
>> @@ -3374,10 +3373,14 @@ static int nfsd4_check_drc_limit(struct nfsd4_compoundres *resp)
>> dprintk("%s length %u, xb->page_len %u tlen %u pad %u\n", __func__,
>> length, xb->page_len, tlen, pad);
>>
>> - if (length <= session->se_fchannel.maxresp_cached)
>> - return status;
>> - else
>> - return nfserr_rep_too_big_to_cache;
>> + if (length > session->se_fchannel.maxresp_sz)
>> + args->ops[resp->opcnt].status = nfserr_rep_too_big;
>> +
>> + if (slot->sl_cachethis == 1 &&
>> + length > session->se_fchannel.maxresp_cached)
>> + args->ops[resp->opcnt].status = nfserr_rep_too_big_to_cache;
>> +
>> + return 0;
>> }
>
> and then we check that status and use it when we process the next
> compound:
Yes,
>
>> @@ -1188,7 +1196,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
>> goto encode_op;
>> }
>>
>> - if (opdesc->op_func)
>> + if (op->status == 0 && opdesc->op_func)
>> op->status = opdesc->op_func(rqstp, cstate, &op->u);
>> else
>> BUG_ON(op->status == nfs_ok);
>
> Do I have that right?
Yes, you are right.
>
> Is there some reason we need to do it that way? Why not instead do
> something like:
It sounds great!
I will have a try.
>
> }
>
> + op->status == nfsd4_check_resp_size(resp)
> + if (op->status)
> + goto encode_op;
> if (opdesc->op_func)
> op->status = opdesc->op_func(rqstp-, cstate, &op->u);
>
> in nfsd4_proc_compound.
>
> A minor nitpick (already in the existing code):
>
>> @@ -3336,32 +3336,31 @@ static nfsd4_enc nfsd4_enc_ops[] = {
>> * Calculate the total amount of memory that the compound response has taken
>> * after encoding the current operation.
>> *
>> - * pad: add on 8 bytes for the next operation's op_code and status so that
>> - * there is room to cache a failure on the next operation.
>> - *
>> - * Compare this length to the session se_fmaxresp_cached.
>> + * Compare this length to the session se_fmaxresp_sz and se_fmaxresp_cached.
>> *
>> * Our se_fmaxresp_cached will always be a multiple of PAGE_SIZE, and so
>> * will be at least a page and will therefore hold the xdr_buf head.
>> */
>> -static int nfsd4_check_drc_limit(struct nfsd4_compoundres *resp)
>> +static int nfsd4_check_resp_limit(struct nfsd4_compoundres *resp)
>> {
>> int status = 0;
>
> Status is *always* 0 in this function. Let's just get rid of it.
OK.
>
> Oh boy:
>
>> +static u32 nfsd4_enc_getattr_playload(struct svc_rqst *rqstp)
>> +{
>> + struct nfsd4_compoundargs *args = rqstp->rq_argp;
>> + struct nfsd4_compoundres *resp = rqstp->rq_resp;
>> + struct nfsd4_op op = args->ops[resp->opcnt];
>> + u32 *bmval = op.u.getattr.ga_bmval;
>> + u32 bmval0 = bmval[0];
>> + u32 bmval1 = bmval[1];
>> + u32 bmval2 = bmval[2];
>> + u32 mlen = 0, lc = 0;
>> +
>> + if (bmval2)
>> + mlen += 16;
>> + else if (bmval1)
>> + mlen += 12;
>> + else
>> + mlen += 8;
>> +
>> + if (bmval0 & FATTR4_WORD0_SUPPORTED_ATTRS) {
>> + if (!nfsd_suppattrs2(resp->cstate.minorversion))
>> + mlen += 12;
>> + else
>> + mlen += 16;
>> + }
>> +
>> + if (bmval0 & FATTR4_WORD0_TYPE)
>> + mlen += 4;
>> + if (bmval0 & FATTR4_WORD0_FH_EXPIRE_TYPE)
>> + mlen += 4;
>
>
> This is getting complicated....
Agree,
>
> The thing is, some of the most complicated ops (read, readdir, getattr)
> don't *really* matter that much, because they don't change anything on
> the filesystem, and don't change the server state in any way.
>
> So maybe we could handle operations in two different ways:
>
> - For operations that actually change something (write, rename,
> open, close, ...), do it the way we're doing it now: be
> very careful to estimate the size of the response before even
> processing the operation.
> - For operations that don't change anything (read, getattr, ...)
> just go ahead and do the operation. If you realize after the
> fact that the response is too large, then return the error at
> that point.
>
> So we'd add another flag to op_flags: say, OP_MODIFIES_SOMETHING. And for
> operations with OP_MODIFIES_SOMETHING set, we'd do the first thing. For
> operations without it set, we'd do the second.
>
> Would there be any problem with doing it that way?
No, I will try to do it as that.
Thanks for comment!
--
----
thanks
Mi Jinlong
next prev parent reply other threads:[~2011-08-24 9:03 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-06-30 8:20 [RFC][PATCH 2/2] nfsd41: try to check reply size before operation Mi Jinlong
2011-07-06 20:19 ` J. Bruce Fields
2011-08-20 10:11 ` [RFC][PATCH v2] " Mi Jinlong
2011-08-23 21:39 ` J. Bruce Fields
2011-08-24 9:07 ` Mi Jinlong [this message]
2011-08-28 10:18 ` [PATCH v3] " Mi Jinlong
2011-09-16 14:30 ` J. Bruce Fields
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E54BF6B.4020803@cn.fujitsu.com \
--to=mijinlong@cn.fujitsu.com \
--cc=bfields@fieldses.org \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.