* [PATCH] Add more entropy to selection of port and xid for NFS client. [not found] <20080304163343.18442.patches@notabene> @ 2008-03-04 5:35 ` NeilBrown 2008-03-04 17:00 ` Chuck Lever 0 siblings, 1 reply; 3+ messages in thread From: NeilBrown @ 2008-03-04 5:35 UTC (permalink / raw) To: Trond Myklebust; +Cc: Chuck Lever, linux-nfs Hi Trond/Chunk I wonder what you think of this patch. I found we needed it for some customers who had machines with NFS root who (for reasons I never fully understood) find they reboot their machines fairly often (maybe it was a test setup or something). The often got "inode number mismatch" errors because the reply came out of the server's cache and was left over from before the client's reboot. Thanks, NeilBrown ### Comments for Changeset Diskless clients that use NFS-root have very little chance to gain much entropy in net_random before the NFS client must choose a port number and an 'xid' to being talking to the server. So add as much entropy as possible. Without this, clients can and do suffer from collisions in the servers reply cache, and get meaningless replies. Signed-off-by: Neil Brown <neilb@suse.de> ### Diffstat output ./net/sunrpc/xprt.c | 5 ++++- ./net/sunrpc/xprtsock.c | 7 +++++-- 2 files changed, 9 insertions(+), 3 deletions(-) diff .prev/net/sunrpc/xprt.c ./net/sunrpc/xprt.c --- .prev/net/sunrpc/xprt.c 2008-03-04 16:33:26.000000000 +1100 +++ ./net/sunrpc/xprt.c 2008-03-04 16:33:36.000000000 +1100 @@ -922,7 +922,10 @@ static inline __be32 xprt_alloc_xid(stru static inline void xprt_init_xid(struct rpc_xprt *xprt) { - xprt->xid = net_random(); + get_random_bytes(&xprt->xid, sizeof(xprt->xid)); + /* And just in case random_state isn't initialised yet.. */ + xprt->xid ^= net_random(); + xprt->xid += (CURRENT_TIME.tv_sec << 10); } static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt) diff .prev/net/sunrpc/xprtsock.c ./net/sunrpc/xprtsock.c --- .prev/net/sunrpc/xprtsock.c 2008-03-04 16:33:26.000000000 +1100 +++ ./net/sunrpc/xprtsock.c 2008-03-04 16:28:06.000000000 +1100 @@ -1283,8 +1283,11 @@ static void xs_udp_timer(struct rpc_task static unsigned short xs_get_random_port(void) { unsigned short range = xprt_max_resvport - xprt_min_resvport; - unsigned short rand = (unsigned short) net_random() % range; - return rand + xprt_min_resvport; + unsigned short rand; + get_random_bytes(&rand, sizeof(rand)); + rand ^= (unsigned short)net_random(); + rand ^= (CURRENT_TIME.tv_nsec >> 10); + return (rand % range) + xprt_min_resvport; } /** ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Add more entropy to selection of port and xid for NFS client. 2008-03-04 5:35 ` [PATCH] Add more entropy to selection of port and xid for NFS client NeilBrown @ 2008-03-04 17:00 ` Chuck Lever 2008-03-06 2:36 ` Neil Brown 0 siblings, 1 reply; 3+ messages in thread From: Chuck Lever @ 2008-03-04 17:00 UTC (permalink / raw) To: NeilBrown; +Cc: Trond Myklebust, linux-nfs Hi Neil- On Mar 4, 2008, at 12:35 AM, NeilBrown wrote: > Hi Trond/Chunk > > I wonder what you think of this patch. > > I found we needed it for some customers who had machines with NFS root > who (for reasons I never fully understood) find they reboot their > machines fairly often (maybe it was a test setup or something). The > often got "inode number mismatch" errors because the reply came out of > the server's cache and was left over from before the client's reboot. If the problem is indeed lack of entropy at boot time, then random32 () should be fixed. It suggests that random32_init() isn't getting invoked soon enough during boot to have any effect on the starting XID selection. This seems to get broken periodically... perhaps random32() should warn if it detects that random32_init() hasn't been invoked yet. Another alternative: continue to use net_random(), but the RPC client could add entropy via net_srandom() calls during it's initialization. Unfortunately net_srandom() works only on the local CPU's entropy pool, so net_random() calls on other CPUs would still be a problem. More below... > ### Comments for Changeset > > Diskless clients that use NFS-root have very little chance > to gain much entropy in net_random before the NFS client must > choose a port number and an 'xid' to being talking to the server. > So add as much entropy as possible. > > Without this, clients can and do suffer from collisions in the servers > reply cache, and get meaningless replies. > > Signed-off-by: Neil Brown <neilb@suse.de> > > ### Diffstat output > ./net/sunrpc/xprt.c | 5 ++++- > ./net/sunrpc/xprtsock.c | 7 +++++-- > 2 files changed, 9 insertions(+), 3 deletions(-) > > diff .prev/net/sunrpc/xprt.c ./net/sunrpc/xprt.c > --- .prev/net/sunrpc/xprt.c 2008-03-04 16:33:26.000000000 +1100 > +++ ./net/sunrpc/xprt.c 2008-03-04 16:33:36.000000000 +1100 > @@ -922,7 +922,10 @@ static inline __be32 xprt_alloc_xid(stru > > static inline void xprt_init_xid(struct rpc_xprt *xprt) > { > - xprt->xid = net_random(); > + get_random_bytes(&xprt->xid, sizeof(xprt->xid)); In fact I think we used to use get_random_bytes() for the initial XID, and that wasn't initialized at boot time either. We hit exactly the same problem. > + /* And just in case random_state isn't initialised yet.. */ > + xprt->xid ^= net_random(); > + xprt->xid += (CURRENT_TIME.tv_sec << 10); > } I would use tv_nsec instead -- that will give a lot more randomness. I see that the port selection logic does use tv_nsec. Basically you are using get_random_bytes and net_random, both of which are known not to work at boot time. :-) > static void xprt_request_init(struct rpc_task *task, struct > rpc_xprt *xprt) > > diff .prev/net/sunrpc/xprtsock.c ./net/sunrpc/xprtsock.c > --- .prev/net/sunrpc/xprtsock.c 2008-03-04 16:33:26.000000000 +1100 > +++ ./net/sunrpc/xprtsock.c 2008-03-04 16:28:06.000000000 +1100 > @@ -1283,8 +1283,11 @@ static void xs_udp_timer(struct rpc_task > static unsigned short xs_get_random_port(void) > { > unsigned short range = xprt_max_resvport - xprt_min_resvport; > - unsigned short rand = (unsigned short) net_random() % range; > - return rand + xprt_min_resvport; > + unsigned short rand; > + get_random_bytes(&rand, sizeof(rand)); > + rand ^= (unsigned short)net_random(); > + rand ^= (CURRENT_TIME.tv_nsec >> 10); > + return (rand % range) + xprt_min_resvport; > } > > /** Maybe you should create a new shared function (perhaps in xprt.c) that does the mashing up of the current time and some random values. Port selection can take the middle 16 bits of a 32-bit random number. -- Chuck Lever chuck[dot]lever[at]oracle[dot]com ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Add more entropy to selection of port and xid for NFS client. 2008-03-04 17:00 ` Chuck Lever @ 2008-03-06 2:36 ` Neil Brown 0 siblings, 0 replies; 3+ messages in thread From: Neil Brown @ 2008-03-06 2:36 UTC (permalink / raw) To: Chuck Lever; +Cc: Trond Myklebust, linux-nfs On Tuesday March 4, chuck.lever@oracle.com wrote: > Hi Neil- > > On Mar 4, 2008, at 12:35 AM, NeilBrown wrote: > > Hi Trond/Chunk > > > > I wonder what you think of this patch. > > > > I found we needed it for some customers who had machines with NFS root > > who (for reasons I never fully understood) find they reboot their > > machines fairly often (maybe it was a test setup or something). The > > often got "inode number mismatch" errors because the reply came out of > > the server's cache and was left over from before the client's reboot. > > If the problem is indeed lack of entropy at boot time, then random32 > () should be fixed. It suggests that random32_init() isn't getting > invoked soon enough during boot to have any effect on the starting > XID selection. This seems to get broken periodically... perhaps > random32() should warn if it detects that random32_init() hasn't been > invoked yet. Ahhh..... This wasn't with the most recent kernel (at all) and I see that the net_random being used at the time is somewhat more simplistic than the random32 that is now being used. So maybe the problem has already been fixed at a deeper level. Certainly random32 is being initialised with a bit more entropy than net_random was at the time. My approach was: these numbers really really need to not repeat, so mix together as much potential entropy as possible, hoping that some of it really will be different each reboot. Maybe random32 is now sufficient in itself. > > Another alternative: continue to use net_random(), but the RPC client > could add entropy via net_srandom() calls during it's > initialization. There is nothing random in the MOUNT reply, is there? I guess if TCP were used, the servers sequence number would be random enough to remove and XID collision problems. But that is probably fairly hard to extract. I think I might just assume that the new net_random (aka random32) is good enough, and that the problem only exists in older kernels. Thanks, NeilBrown ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2008-03-06 2:36 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20080304163343.18442.patches@notabene>
2008-03-04 5:35 ` [PATCH] Add more entropy to selection of port and xid for NFS client NeilBrown
2008-03-04 17:00 ` Chuck Lever
2008-03-06 2:36 ` Neil Brown
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.