Re: [NFS] NFSD over TCP: TCP broken?

* Re: [NFS] NFSD over TCP: TCP broken?
       [not found] <00f501c156f9$95337ef0$3291b40a@fserv2000.net>
@ 2001-10-17 12:38 ` Shirish Kalele
  2001-10-17 17:58   ` kuznet
  0 siblings, 1 reply; 5+ messages in thread
From: Shirish Kalele @ 2001-10-17 12:38 UTC (permalink / raw)
  To: nfs, linux-kernel

Okay, looking at tcp_sendmsg a little more, it looks like it lets go of the
sock lock in wait_for_tcp_memory before re-acquiring it, which is probably
where the interleaving gets in. I'm not sure if TCP should be handling this
or NFSD. From what little I know, TCP should serialize requests it gets and
atomically write them out, preventing interleaving, and it looks like it
doesn't do that.

- Shirish

----- Original Message -----
From: "Shirish Kalele" <kalele@veritas.com>
To: <kernel@vger.linux.org>; <nfs@lists.sourceforge.net>
Sent: Wednesday, October 17, 2001 3:50 AM
Subject: [NFS] NFSD over TCP: TCP broken?


> Hi,
>
> I've been looking at running nfsd over tcp on Linux. I modified the #ifdef
> so that nfsd uses tcp. I also made writes to the socket blocking, so that
> the thread blocks till the entire reply has been accepted by TCP. (I know
> the right way is going to be to have an independent thread whose job would
> be to just pick replies off a queue and block on sending them to tcp, but
> this is what I've done temporarily.)
>
> Then I tried to copy a directory from a Solaris client to the Linux server
> using nfsv3 over tcp. This took a long time, with lots of delays where
> nothing was being transferred.
>
> Looking at the network traces, it looks like the RPC records being sent over
> TCP are inconsistent with the lengths specified in the record marker. This
> happens mainly when 3-4 requests arrive one after the other and you have 3-4
> threads replying to these requests in parallel. It looks like TCP gets
> hopelessly confused and botches up the replies being sent. I point my finger
> at TCP because tcp_sendmsg returns a valid length indicating that the entire
> reply was accepted, but the tcp sequence numbers show that the RPC record
> sent on the wire wasn't equal to the length accepted by TCP. After a while,
> the client realizes it's out of sync when it gets an invalid RPC record
> marker, and resets and reconnects. This repeats multiple times.
>
> Is TCP known to break when multiple threads try to send data down the pipe
> simultaneously? Is there a known fix for this? Where should I be focussing to
> fix the problem?
>
> I'm not on the list, so please include me in replies.
>
> Thanks,
> Shirish



* Re: [NFS] NFSD over TCP: TCP broken?
  2001-10-17 12:38 ` [NFS] NFSD over TCP: TCP broken? Shirish Kalele
@ 2001-10-17 17:58   ` kuznet
  2001-10-17 18:38     ` Trond Myklebust
  0 siblings, 1 reply; 5+ messages in thread
From: kuznet @ 2001-10-17 17:58 UTC (permalink / raw)
  To: Shirish Kalele; +Cc: linux-kernel

Hello!

> where the interleaving gets in.

I do not think that you diagnosed the problem correctly.
nfsd used non blocking io and write to tcp is strictly atomic in this case.

>		 I'm not sure if TCP should be handling this
> or NFSD. From what little I know, TCP should serialize requests it gets and
> atomically write them out,

However, it does not and it should not. Like concurrent write()
to any other file, the result is unpredictably interleaved data.

Alexey
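
To illustrate the interleaving described above, here is a minimal Python
sketch (not code from nfsd or the kernel). It frames two replies with an
RPC-style record marker, a 4-byte big-endian word whose high bit flags the
last fragment and whose low 31 bits give the fragment length, and then parses
the stream twice: once with the replies written back to back, and once with
one reply cut in two by the other. The helper names `frame` and `parse`, the
payloads, and the 8-byte split point are assumptions chosen for illustration,
and the parser accepts only single-fragment records.

```python
import struct

LAST_FRAGMENT = 0x80000000  # high bit of the 4-byte record marker


def frame(payload):
    """Prefix payload with a record marker (one final fragment)."""
    return struct.pack(">I", LAST_FRAGMENT | len(payload)) + payload


def parse(stream):
    """Split a byte stream into records; raise ValueError on bad framing."""
    records, pos = [], 0
    while pos < len(stream):
        if pos + 4 > len(stream):
            raise ValueError("truncated record marker")
        (marker,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        if not marker & LAST_FRAGMENT:
            # Simplification: this sketch does not reassemble fragments.
            raise ValueError("unexpected non-final fragment")
        length = marker & 0x7FFFFFFF
        if pos + length > len(stream):
            raise ValueError("record length runs past end of stream")
        records.append(stream[pos:pos + length])
        pos += length
    return records


reply_a = frame(b"A" * 12)
reply_b = frame(b"B" * 12)

# Serialized: each reply reaches the stream whole.
serialized = reply_a + reply_b

# Interleaved: writer A is interrupted after 8 bytes, writer B sends its
# whole reply, then A finishes. (Hypothetical split point.)
interleaved = reply_a[:8] + reply_b + reply_a[8:]
```

Parsing `serialized` recovers both replies. Parsing `interleaved` first
yields a 12-byte "record" that contains part of B's marker, and then the
parser reads payload bytes as the next marker and raises `ValueError`, which
is the kind of desynchronization the client reported.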


* Re: [NFS] NFSD over TCP: TCP broken?
  2001-10-17 17:58   ` kuznet
@ 2001-10-17 18:38     ` Trond Myklebust
  0 siblings, 0 replies; 5+ messages in thread
From: Trond Myklebust @ 2001-10-17 18:38 UTC (permalink / raw)
  To: kuznet; +Cc: Shirish Kalele, linux-kernel

>>>>> " " == kuznet  <kuznet@ms2.inr.ac.ru> writes:

     > Hello!
    >> where the interleaving gets in.

     > I do not think that you diagnosed the problem correctly.  nfsd
     > used non blocking io and write to tcp is strictly atomic in
     > this case.

Some of the patches that attempted to fix the nfsd server code relied
on making the TCP stuff blocking. I've seen several such patches
floating around that ignore the fact that the socket lock is dropped
when the IPV4 socket code sleeps.

In any case, even with nonblocking TCP, one has to protect the socket
until the entire message has been sent. Otherwise we risk seeing
another thread racing for the socket while we're doing whatever needs
to be done to clear the -EAGAIN.

Cheers,
  Trond
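
A user-space sketch of the locking rule described above, in Python rather
than kernel C: several threads share one non-blocking stream socket, and each
holds a lock until its whole record has been written, including across the
retry that follows an EAGAIN (surfaced in Python as `BlockingIOError`). The
names `send_lock`, `send_record`, `recv_exact` and `demo`, the writer count
and the payload size are all assumptions made for illustration; this is not
the nfsd fix itself.

```python
import select
import socket
import struct
import threading

send_lock = threading.Lock()  # hypothetical per-socket send lock


def send_record(sock, payload):
    """Send one record-marked reply, holding the lock until every byte is out."""
    data = memoryview(struct.pack(">I", 0x80000000 | len(payload)) + payload)
    with send_lock:
        while data:
            try:
                sent = sock.send(data)
            except BlockingIOError:
                # Socket buffer is full (EAGAIN). Wait for space while still
                # holding the lock, so no other writer can slip in.
                select.select([], [sock], [])
                continue
            data = data[sent:]


def recv_exact(sock, n):
    """Read exactly n bytes from a blocking stream socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise EOFError("peer closed mid-record")
        buf += chunk
    return bytes(buf)


def demo(writers=4, size=300_000):
    """Run several concurrent writers and return the records as received."""
    tx, rx = socket.socketpair()
    tx.setblocking(False)
    threads = [
        threading.Thread(target=send_record, args=(tx, bytes([65 + i]) * size))
        for i in range(writers)
    ]
    for t in threads:
        t.start()
    records = []
    for _ in range(writers):
        (marker,) = struct.unpack(">I", recv_exact(rx, 4))
        records.append(recv_exact(rx, marker & 0x7FFFFFFF))
    for t in threads:
        t.join()
    tx.close()
    rx.close()
    return records
```

Calling `demo()` should return one intact record per writer, in whatever
order the writers won the lock. Dropping the lock inside the `except` branch
would reopen the window in which another thread can write into the middle of
a half-sent record.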


* Re: [NFS] NFSD over TCP: TCP broken?
       [not found] <005a01c1579d$adab2100$3291b40a@fserv2000.net>
@ 2001-10-17 18:39 ` kuznet
  0 siblings, 0 replies; 5+ messages in thread
From: kuznet @ 2001-10-17 18:39 UTC (permalink / raw)
  To: Shirish Kalele; +Cc: linux-kernel, tamir, paulp

Hello!

> I'm making nfsd do blocking writes.

I see. Well, then you should make this right. :-)


> send(3N) manpage on Linux also says messages should be sent atomically.

Sorry? Please, cite, I cannot find this. send() behaviour used to be
pretty different. :-)

Alexey


* Re: [NFS] NFSD over TCP: TCP broken?
       [not found] <00b401c157a1$8edd3f20$3291b40a@fserv2000.net>
@ 2001-10-17 19:27 ` kuznet
  0 siblings, 0 replies; 5+ messages in thread
From: kuznet @ 2001-10-17 19:27 UTC (permalink / raw)
  To: Shirish Kalele; +Cc: linux-kernel, tamir, paulp

Hello!

>        through the underlying protocol,  the  error  EMSGSIZE  is
>        returned, and the message is not transmitted.

That is about datagram sockets. Stream sockets never return EMSGSIZE,
because they have no message boundaries.

Alexey
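
The difference between the two socket types can be seen with a short Python
sketch, assuming a Unix-like system where `socket.socketpair` supports
`AF_UNIX` with both `SOCK_DGRAM` and `SOCK_STREAM`. The variable names and
payloads are illustrative only.

```python
import socket

# Datagram pair: each send() is one message and each recv() returns one.
d_tx, d_rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
d_tx.send(b"abc")
d_tx.send(b"def")
datagrams = [d_rx.recv(1024), d_rx.recv(1024)]
d_tx.close()
d_rx.close()

# Stream pair: the receiver sees only a sequence of bytes. How many recv()
# calls it takes to drain them is not tied to how many send() calls were made.
s_tx, s_rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
s_tx.send(b"abc")
s_tx.send(b"def")
s_tx.close()  # signal end of stream so the read loop terminates

stream = b""
while True:
    chunk = s_rx.recv(1024)
    if not chunk:
        break
    stream += chunk
s_rx.close()
```

Here `datagrams` keeps the two messages separate, while `stream` is just the
six bytes in order. Any record structure on a stream socket, such as the RPC
record marker, has to be imposed and protected by the application.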

