From: NeilBrown <neilb@suse.com>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH] SUNRPC: handle -EAGAIN from socket writes better.
Date: Mon, 6 Jul 2015 11:26:47 +1000 [thread overview]
Message-ID: <20150706112647.2d4c3780@noble> (raw)
In-Reply-To: <1435931379-6654-1-git-send-email-trond.myklebust@primarydata.com>
On Fri, 3 Jul 2015 09:49:37 -0400 Trond Myklebust
<trond.myklebust@primarydata.com> wrote:
> Hi Neil,
>
> There should already be a handler for ENOBUFS in call_status, but
> I can see that it has a couple of flaws. What say we try to fix that
> instead?
>
> Cheers
> Trond
>
Hi Trond,
your patches make sense I think, but they are only part of a solution.
In the problem case the error comes from sk_stream_wait_memory, and
that returns EAGAIN, never ENOBUFS. So fixing the handling of ENOBUFS
won't be enough.
The call path is
xs_tcp_send_request -> xs_sendpages -> xs_send_pagedata ->
(sock->ops->sendpage == tcp_sendpage) -> do_tcp_sendpage ->
sk_stream_wait_memory
and EAGAIN travels all the way from bottom to top unmolested.
We could possibly change sk_stream_wait_memory to return ENOBUFS if
vm_wait is != 0, but:
- that could have other consequences so needs to go through netdev
and probably isn't a quick fix
- there could be other paths that don't return ENOBUFS - it really
don't seem that ENOBUFS appears all that much in 'net/' in places
where it would need to...
Maybe we could check and translate the error in xs_sendpages:
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 66891e32c5e3..8474d79ec2b2 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -431,6 +431,14 @@ out:
if (err > 0) {
*sent_p += err;
err = 0;
+ } else if (err == -EAGAIN) {
+ /* Might be wrong error code. */
+ if (sock->sk->sk_write_space == xs_tcp_write_space &&
+ sk_stream_is_writeable(sock->sk))
+ err = -ENOBUFS;
+ if (sock->sk->sk_write_space == xs_udp_write_space &&
+ sock_writeable(sock->sk))
+ err = -ENOBUFS;
}
return err;
}
Though that is a bit of a hack. If/when net-dev gets the correct error
returns, we can then remove that.
Though I'm beginning to wonder if ENOBUFS is the correct error code
anyway. "man 2 send" suggests ENOMEM, with ENOBUFS meaning:
The output queue for a network interface was full. This
generally indicates that the interface has stopped send-
ing, but may be caused by transient congestion. (Nor-
mally, this does not occur in Linux. Packets are just
silently dropped when a device queue overflows.)
So I'm not sure I feel to comfortable about relying on the exact error
code.
What do you think?
Thanks,
NeilBrown
next prev parent reply other threads:[~2015-07-06 1:26 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-29 4:26 [PATCH] SUNRPC: handle -EAGAIN from socket writes better NeilBrown
2015-07-03 13:49 ` Trond Myklebust
2015-07-03 13:49 ` [PATCH 1/2] SUNRPC: Don't reencode message if transmission failed with ENOBUFS Trond Myklebust
2015-07-03 13:49 ` [PATCH 2/2] SUNRPC: Don't confuse ENOBUFS with a write_space issue Trond Myklebust
2015-07-06 1:26 ` NeilBrown [this message]
2015-07-06 13:36 ` [PATCH] SUNRPC: handle -EAGAIN from socket writes better Trond Myklebust
2015-07-07 21:01 ` NeilBrown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150706112647.2d4c3780@noble \
--to=neilb@suse.com \
--cc=linux-nfs@vger.kernel.org \
--cc=trond.myklebust@primarydata.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox