From: Jeff Garzik <jeff@garzik.org>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: NFS list <linux-nfs@vger.kernel.org>
Subject: Re: Linux client misses lack of open-confirm?
Date: Sat, 22 Dec 2007 21:05:44 -0500
Message-ID: <476DC278.7090309@garzik.org>
In-Reply-To: <1198337249.7741.52.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
Trond Myklebust wrote:
> On Fri, 2007-12-21 at 23:15 -0500, Jeff Garzik wrote:
>> While debugging my NFS server, I may have caught a Linux client bug.
>>
>> My server is currently buggy in that it never sets the
>> OPEN4_RESULT_CONFIRM bit after an OPEN with a new owner. Shockingly, I
>> can pass ~530 pynfs tests, fsx-linux [Linux v4 client], and build a
>> kernel [Linux v4 client] even with such brokenness. ;-)
>>
>> Anyway, the Linux NFSv4 client (2.6.24-rc6) seems quite happy with this
>> state of affairs, right until CLOSE time, when it passes "seqid + 2" to
>> my server rather than the expected "seqid + 1".
>>
>> Though I am quite happy that Linux managed to work around my stupid
>> server and store data successfully _anyway_, I thought it was worth
>> commenting. I was assuming either
>>
>> a) Linux would notice the lack of OPEN4_RESULT_CONFIRM and
>> complain accordingly, or,
>>
>> b) Linux would generate a correct seqid, taking into account
>> the fact that it did not issue OPEN_CONFIRM.
>>
>> As you can see from the wireshark-0.99.7-2.fc8 binary dump at
>>
>> http://gtf.org/garzik/misc/dump.bz2 (33k compressed)
>>
>> we see many examples of
>>
>> C: OPEN (seqid == 0)
>> S: NFS4_OK
>>
>> C: [perhaps some intervening READ or WRITE or *ATTR]
>> S: [replies as expected]
>>
>> C: CLOSE (seqid == 2)
>> S: NFS4ERR_BAD_SEQID
>>
>> If you feel this behavior is fine given a broken server, that's cool...
>> I just figured I would post in case somebody cared about this data point.
>
> Hmm... That's not good. It is perfectly legal for a server to not
> request OPEN4_RESULT_CONFIRM (although it is probably not a very good
> idea), and the client should be able to cope with that.
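To make the arithmetic concrete, here is a toy model of the seqid I
would expect at CLOSE time. It is only a sketch, not the Linux client
code: the function name is made up, it assumes each seqid-bearing
operation on an owner (OPEN, OPEN_CONFIRM, CLOSE) uses the previous
seqid plus one, and it ignores 32-bit wraparound.

```python
def expected_close_seqid(open_seqid, server_requested_confirm):
    """Return the seqid a client should send with CLOSE (toy model).

    Assumption: OPEN used open_seqid, and every later seqid-bearing
    operation for the same owner uses the previous seqid + 1.
    """
    next_seqid = open_seqid + 1      # OPEN consumed open_seqid
    if server_requested_confirm:
        # The client sends OPEN_CONFIRM, which consumes one more seqid.
        next_seqid += 1
    return next_seqid


# Server did not set OPEN4_RESULT_CONFIRM (my buggy server).
print(expected_close_seqid(0, False))
# Server set OPEN4_RESULT_CONFIRM, so OPEN_CONFIRM was sent in between.
print(expected_close_seqid(0, True))
```

In the trace, the client behaves as in the second call even though the
server's reply corresponds to the first.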

If you want to reproduce, my server is open (though largely unannounced,
since it's still in the initial coding phase):

	git://git.kernel.org/pub/scm/daemon/nfs/nfs4-ram.git

Commit b3f602203ab023aa559c4db5449448b9c7044f36 (HEAD~2 currently) can
reproduce the behavior nicely.

The server is currently a zero-configuration-file RAM server, so it's
easy to test: just build and run (./nfs4_ramd). It binds to port 2049
with an empty filesystem each time it is started (--help for an
alternate port or other options).

> I'll have a look at what is going on there.
>
>> P.S. I really really hate stateid/seqids at this point. RFC
>> notwithstanding, they are basically undocumented. I am reduced to
>> poking through NFSv4 WG archives and Linux kernel code to find out what
>> my server should be doing. pynfs is no help here, either.
>
> The primary function of seqids is to allow the server to distinguish
> replayed non-idempotent RPC requests from new requests, so their
> properties are really quite simple:
>
> * If the seqid presented by the client is in sequence, then the
> server is supposed to handle the request.
> * If the seqid matches that of the last request, then the server
> is supposed to replay the reply.
> * If the seqid is completely out of sequence, then the server
> should return the BAD_SEQID error.
>
> As for stateids, their purpose is to allow the server to figure out to
> which client it is talking, and to track what state the client thinks it
> is holding. Apart from the seqid field (which is there in order to track
> the ordering of OPEN requests), a stateid is an opaque structure.
> The only really important requirement here is that you need to be able
> to distinguish stale state from valid state so that you can fence off
> RPC requests that refer to stale locks.
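One way to get the stale-versus-valid distinction described above is to
embed a per-boot identifier in the opaque part of the stateid. The
sketch below is only an illustration under that assumption: the
StateTable class, its field layout (8-byte boot id plus 4-byte table
index in the 12-byte "other" field) and its method names are all
hypothetical, and superseded state (NFS4ERR_OLD_STATEID) is not modeled.

```python
import os
import struct

NFS4_OK = "NFS4_OK"
NFS4ERR_STALE_STATEID = "NFS4ERR_STALE_STATEID"
NFS4ERR_BAD_STATEID = "NFS4ERR_BAD_STATEID"


class StateTable:
    """Hypothetical per-server-instance table of issued state."""

    def __init__(self):
        # Assumption: 8 random bytes identify this server instance.
        self.boot_id = os.urandom(8)
        self.next_index = 0
        self.live = set()

    def new_other(self):
        """Build a 12-byte opaque field: boot id + table index."""
        index = self.next_index
        self.next_index += 1
        self.live.add(index)
        return self.boot_id + struct.pack(">I", index)

    def validate(self, other):
        """Classify an opaque field presented by a client."""
        if len(other) != 12:
            return NFS4ERR_BAD_STATEID
        boot_id = other[:8]
        (index,) = struct.unpack(">I", other[8:])
        if boot_id != self.boot_id:
            return NFS4ERR_STALE_STATEID  # issued by an earlier instance
        if index not in self.live:
            return NFS4ERR_BAD_STATEID    # unknown to this instance
        return NFS4_OK


before_restart = StateTable()
other = before_restart.new_other()
print(before_restart.validate(other))

after_restart = StateTable()   # simulates a server restart
print(after_restart.validate(other))
```

With this layout a restart fences off every previously issued stateid
at once, without the server having to remember any of them.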

Yeah, I figured out the purpose pretty quickly. The thing I missed was
that the seqid is per-lockowner, and not per-openfile. No surprise
things got weird when I coded a server that tracked seqids per open
file...

Plus there are a ton of undocumented -ordering- constraints you must
follow with regard to validating seqid/stateid and then returning the
correct error.
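
The three seqid rules quoted above, applied per owner, can be sketched
roughly as follows. This is a simplified model rather than what any
real server does: OwnerState and check_seqid are made-up names, the
first request from a new owner is assumed to be accepted with whatever
seqid it carries, wraparound is ignored, and the question of which
errors do or do not advance the seqid is left out entirely.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_BAD_SEQID = "NFS4ERR_BAD_SEQID"


class OwnerState:
    """Per-owner seqid bookkeeping (hypothetical structure)."""

    def __init__(self):
        self.last_seqid = None   # seqid of the last handled request
        self.last_reply = None   # cached reply, used for replays


def check_seqid(owner, seqid, handle):
    """Apply the three rules; handle() performs the actual operation."""
    in_sequence = (owner.last_seqid is None
                   or seqid == owner.last_seqid + 1)
    if in_sequence:
        # Rule 1: in sequence, so handle the request and cache the reply.
        reply = handle()
        owner.last_seqid = seqid
        owner.last_reply = reply
        return reply
    if seqid == owner.last_seqid:
        # Rule 2: same seqid as the last request, so replay the reply.
        return owner.last_reply
    # Rule 3: completely out of sequence.
    return NFS4ERR_BAD_SEQID


owner = OwnerState()
print(check_seqid(owner, 0, lambda: NFS4_OK))  # OPEN from a new owner
print(check_seqid(owner, 2, lambda: NFS4_OK))  # CLOSE that skipped a seqid
```

Keeping one OwnerState per owner, rather than per open file, is the
part I originally got wrong.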

Thanks for the response! Hope my buggy server helps you track down
client problems ;-)

	Jeff