From: Peter Staubach <staubach@redhat.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>,
Jeff Layton <jlayton@redhat.com>,
linux-nfs@vger.kernel.org
Subject: Re: Should truncated READDIR replies return -EIO?
Date: Fri, 08 Feb 2008 13:16:17 -0500
Message-ID: <47AC9C71.4090306@redhat.com>
In-Reply-To: <4B54CC40-B164-4B8D-A5D7-74CE2B684955@oracle.com>

Chuck Lever wrote:
> On Feb 8, 2008, at 10:39 AM, Peter Staubach wrote:
>> Trond Myklebust wrote:
>>> On Fri, 2008-02-08 at 10:04 -0500, Jeff Layton wrote:
>>>
>>>> Recently, I ran across a server-side bug that caused the server to send
>>>> truncated READDIR replies. The server would send a valid RPC response to
>>>> a READDIR call, but the contents of it were basically missing
>>>> (everything after the status).
>>>>
>>>> The server problem had long been patched in mainline kernels, but the
>>>> interesting bit was that clients didn't return an error in this
>>>> situation. The XDR decoders for readdir calls are supposed to check the
>>>> validity of the response, but in this situation it just fudges the
>>>> contents of the pagecache to make it look like a completely empty
>>>> directory.
>>>>
>>>> Shouldn't the client return an error in this situation? The response
>>>> obviously isn't valid so it seems like it shouldn't pretend that it is.
>>>> If so, would something like the following patch make sense?
>>>>
>>>
>>> It is quite valid (though silly!) for a server to return a READDIR reply
>>> with no entries. AFAICR there were servers that actually did this at one
>>> point (though I shall refrain from naming and shaming).
>>>
>>> So whereas I agree that it might be correct to flag a READDIR reply that
>>> contains no entries due to XDR encoding bugs, I'm not sure that we
>>> should be flagging errors in the case where the XDR is correct.
>>
>> In this case, I believe that the response was malformed. Pretty
>> much everything after the status was missing, including the EOF
>> indicator. I would agree that it would be silly to return a
>> response with no error indicated, no entries, and the eof
>> indication set to false.
>>
>> This really boils down to how do we handle malformed responses?
>> Is there a general policy to retransmit the request? This would
>> seem to be the right thing because a malformed response would
>> result from many things including the TCP connection getting
>> dropped in the middle of receiving the response from a timeout
>> and other things. However, in this situation, retransmitting
>> the request would just have resulted in the same, broken response
>> from the server. This was due to a server bug, which has since
>> been fixed, but exists still out in nature.
>
>
> Replies that are malformed network or RPC level packets are dropped by
> the RPC client, and the matching requests are retransmitted by the RPC
> client after a timeout. Network events (like your TCP connection
> example) result in a malformed RPC level packet that the RPC client
> never delivers to the XDR layer, and are thus retransmitted by the RPC
> client.
>
> Replies that have malformed XDR are treated by the NFS client as
> errors. The problem is the decoders (on Linux) are not terribly
> careful about checking the correctness of the server's XDR encoding,
> especially in cases like READDIR (Not to mention compound RPCs!) where
> the decoding can be complex. Olaf has mentioned the Linux XDR layer
> was hand-coded rather than constructed with rpcgen to keep the
> decoders simple and efficient.
>
> Network-related corruption is likely to be caught by the lower
> layers. I tend to think that malformed XDR is nearly always a genuine
> software defect on the server, and thus not worth retransmitting
> (especially if it's an idempotent request!).
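
To make the difference between a valid empty READDIR reply and a truncated
one concrete, here is a minimal sketch in Python. It is not the Linux
decoder. It assumes a reply body modeled loosely on the NFSv2 READDIR result
(status, then entries each preceded by a "value follows" flag, then an eof
flag), and the names decode_readdir and TruncatedReply are made up for
illustration.

```python
import struct


class TruncatedReply(Exception):
    """Hypothetical error: the reply ends before the XDR says it should."""


def _u32(buf, off):
    # Every XDR integer and boolean is 4 bytes, big-endian.
    if off + 4 > len(buf):
        raise TruncatedReply(f"need 4 bytes at offset {off}, have {len(buf) - off}")
    return struct.unpack_from(">I", buf, off)[0], off + 4


def decode_readdir(buf):
    """Decode a simplified READDIR reply body into (status, names, eof)."""
    status, off = _u32(buf, 0)
    if status != 0:
        # Error replies carry nothing after the status in this sketch.
        return status, [], False
    names = []
    while True:
        follows, off = _u32(buf, off)       # "value follows" flag
        if not follows:
            break
        _fileid, off = _u32(buf, off)
        length, off = _u32(buf, off)
        padded = (length + 3) & ~3          # XDR pads opaque data to 4 bytes
        if off + padded > len(buf):
            raise TruncatedReply("entry name runs past end of reply")
        names.append(buf[off:off + length].decode("utf-8", "replace"))
        off += padded
        _cookie, off = _u32(buf, off)       # 4-byte opaque cookie
    eof, off = _u32(buf, off)
    return status, names, bool(eof)
```

With this layout, a reply of status OK, "no value follows", eof set is a
legal empty directory, while a reply that stops right after the status
raises TruncatedReply instead of being reported as an empty listing. A
caller could map that exception to -EIO.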
What happens if a response is interrupted in the middle by the
TCP connection being broken? Is this caught at the RPC layer
and then rejected?
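
For reference, the mechanism I have in mind is the record marking that RPC
uses over TCP: each fragment is preceded by a 4-byte header whose top bit
marks the last fragment and whose low 31 bits give the fragment length. A
rough sketch of a receiver follows. It does not describe what the Linux RPC
client actually does. The names read_record and IncompleteRecord are
hypothetical, and the stream is assumed to be a buffered file-like object
whose read(n) returns fewer than n bytes only at end of stream.

```python
import io
import struct


class IncompleteRecord(Exception):
    """Hypothetical error: the stream ended in the middle of a record."""


def _read_exact(stream, n):
    data = stream.read(n)
    if len(data) != n:
        raise IncompleteRecord(f"wanted {n} bytes, got {len(data)}")
    return data


def read_record(stream):
    """Reassemble one RPC record from its record-marked fragments."""
    fragments = []
    while True:
        (header,) = struct.unpack(">I", _read_exact(stream, 4))
        last = bool(header & 0x80000000)    # top bit: last fragment
        length = header & 0x7FFFFFFF        # low 31 bits: fragment length
        fragments.append(_read_exact(stream, length))
        if last:
            return b"".join(fragments)


# Usage: a complete record is returned whole; a cut-off one raises.
complete = struct.pack(">I", 0x80000000 | 8) + b"abcdefgh"
assert read_record(io.BytesIO(complete)) == b"abcdefgh"
```

If the connection breaks partway through a fragment, the receiver knows
from the header how many bytes are missing, so the partial record can be
discarded before anything reaches the XDR decoder.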

Thanx...

ps