From: Eric Dumazet <edumazet@google.com>
To: Xin Long <lucien.xin@gmail.com>
Cc: "David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, eric.dumazet@gmail.com,
Jacob Moroni <jmoroni@google.com>,
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Subject: Re: [PATCH net-next] sctp: fix busy polling
Date: Wed, 3 Jan 2024 17:06:22 +0100
Message-ID: <CANn89iKtwvm-32HYBr7ynV3d_TUV2DFGyPyZMbpYYzE_kkwwQA@mail.gmail.com>
In-Reply-To: <CADvbK_fT9-ufQZ1wAh+Zs-0eKJYY__9JhQX_qBo-Fxei5yXnFA@mail.gmail.com>
On Wed, Jan 3, 2024 at 4:14 PM Xin Long <lucien.xin@gmail.com> wrote:
>
> On Wed, Jan 3, 2024 at 5:51 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Fri, Dec 22, 2023 at 7:34 PM Xin Long <lucien.xin@gmail.com> wrote:
> > >
> > > On Fri, Dec 22, 2023 at 12:05 PM Eric Dumazet <edumazet@google.com> wrote:
> > > >
> > > > On Fri, Dec 22, 2023 at 5:08 PM Xin Long <lucien.xin@gmail.com> wrote:
> > > > >
> > > > > On Tue, Dec 19, 2023 at 12:00 PM Eric Dumazet <edumazet@google.com> wrote:
> > > > > >
> > > > > > > Busy polling while holding the socket lock makes little sense,
> > > > > > > because incoming packets won't reach our receive queue.
> > > > > >
> > > > > > Fixes: 8465a5fcd1ce ("sctp: add support for busy polling to sctp protocol")
> > > > > > Reported-by: Jacob Moroni <jmoroni@google.com>
> > > > > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > > > > Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> > > > > > Cc: Xin Long <lucien.xin@gmail.com>
> > > > > > ---
> > > > > > net/sctp/socket.c | 10 ++++------
> > > > > > 1 file changed, 4 insertions(+), 6 deletions(-)
> > > > > >
> > > > > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > > > > > index 5fb02bbb4b349ef9ab9c2790cccb30fb4c4e897c..6b9fcdb0952a0fe599ae5d1d1cc6fa9557a3a3bc 100644
> > > > > > --- a/net/sctp/socket.c
> > > > > > +++ b/net/sctp/socket.c
> > > > > > @@ -2102,6 +2102,10 @@ static int sctp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
> > > > > > >  	if (unlikely(flags & MSG_ERRQUEUE))
> > > > > > >  		return inet_recv_error(sk, msg, len, addr_len);
> > > > > > >
> > > > > > > +	if (sk_can_busy_loop(sk) &&
> > > > > > > +	    skb_queue_empty_lockless(&sk->sk_receive_queue))
> > > > > > > +		sk_busy_loop(sk, flags & MSG_DONTWAIT);
> > > > > > +
> > > > > > There is no sk_state check here. If the SCTP socket (TCP type) has
> > > > > > already been closed by the peer, will sctp_recvmsg() block here?
> > > >
> > > > Busy polling is only polling the NIC queue, hoping to feed this socket
> > > > for incoming packets.
> > > OK, will it block if there are no incoming packets on the NIC queue?
> > >
> > > If yes, when sysctl net.core.busy_read=1, my concern is:
> > >
> > > client server
> > > -------------------------------
> > > listen()
> > > connect()
> > > accept()
> > > close()
> > > recvmsg() <----
> > >
> > > recvmsg() is supposed to return right away as the connection is
> > > already closed. With this patch, will recvmsg() still be able to do
> > > that if there are no more incoming packets on the NIC after close()?
> >
> >
> > Answer is yes for a variety of reasons :
> >
> > net.core.busy_read=1 means :
> >
> > Busy poll will happen for
> > 1) at most one usec, and
> I see. I have never used busy polling, but what if it is set to a large value,
> like minutes? I might just be overthinking this, and no one would do that.
>
No problem, you can look at
https://netdevconf.info/2.1/papers/BusyPollingNextGen.pdf for
a short introduction.
<quote>
Suggested settings are in the 50 to 100 us range
</quote>
> > 2) as long as there is no packet in sk->sk_receive_queue (see
> > sk_busy_loop_end())
> It's likely that, after being closed by the peer, there is no packet in sk_receive_queue.
>
> >
> > But busy poll is only started on sockets that had established packets.
> I think it won't be told to break when the socket is closed by the peer.
This is fine really.
sk_busy_loop_end() works fine as is for UDP/TCP sockets, and it does
not look at sk_state.
Keep in mind that polling applications call recvmsg() 20,000 times per second,
so there is no point trying to optimize the last call.
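
To make the two exit conditions concrete, here is a small userspace
model of the check. It is a sketch only, not the kernel source: struct
model_sock, its fields and model_busy_loop_end() are made up for
illustration, and time is passed in explicitly so the logic can be
exercised without a NIC. The point is that the loop ends as soon as the
receive queue is non-empty or the budget has expired, and that sk_state
plays no part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the socket fields that matter here. */
struct model_sock {
	unsigned int rx_queue_len;	/* models sk->sk_receive_queue */
	uint64_t busy_poll_usecs;	/* models the busy_read / SO_BUSY_POLL budget */
};

/*
 * Returns true when busy polling should stop:
 *  - a packet is already queued on the socket, or
 *  - no budget is configured, or
 *  - the current time is past start + budget.
 * Socket state (closed by peer, shutdown, ...) is deliberately ignored.
 */
static bool model_busy_loop_end(const struct model_sock *sk,
				uint64_t start_us, uint64_t now_us)
{
	if (sk->rx_queue_len > 0)
		return true;

	if (sk->busy_poll_usecs == 0)
		return true;

	return now_us > start_us + sk->busy_poll_usecs;
}
```

With a budget of 1 usec and an empty queue, the model keeps polling at
start + 1 and stops at start + 2, which is why a recvmsg() on a socket
already closed by the peer is delayed by at most the configured budget.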
>
> >
> > A listener will not engage this because sk->sk_napi_id does not
> > contain a valid NAPI ID.
> >
> >
> >
> > >
> > > Thanks.
> > >
> > > >
> > > > Using more than a lockless read of sk->sk_receive_queue is not really necessary,
> > > > and racy anyway.
> > > >
> > > > Eliezer Tamir added a check against sk_state for no good reason in
> > > > TCP, my plan is to remove it.
> > > >
> > > > There are other states where it still makes sense to allow busy polling.
> > > >
> > > >
> > > > >
> > > > > Maybe here it needs a `!(sk->sk_shutdown & RCV_SHUTDOWN)` check,
> > > > > which is set when it's closed by the peer.
> > > >
> > > > See above. Keep this as simple as possible...
> > > >
> > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > > >  	lock_sock(sk);
> > > > > > >
> > > > > > >  	if (sctp_style(sk, TCP) && !sctp_sstate(sk, ESTABLISHED) &&
> > > > > > > @@ -9046,12 +9050,6 @@ struct sk_buff *sctp_skb_recv_datagram(struct sock *sk, int flags, int *err)
> > > > > > >  		if (sk->sk_shutdown & RCV_SHUTDOWN)
> > > > > > >  			break;
> > > > > > >
> > > > > > > -		if (sk_can_busy_loop(sk)) {
> > > > > > > -			sk_busy_loop(sk, flags & MSG_DONTWAIT);
> > > > > > > -
> > > > > > > -			if (!skb_queue_empty_lockless(&sk->sk_receive_queue))
> > > > > > > -				continue;
> > > > > > > -		}
> > > > > > >
> > > > > > >  		/* User doesn't want to wait. */
> > > > > > >  		error = -EAGAIN;
> > > > > > --
> > > > > > 2.43.0.472.g3155946c3a-goog
> > > > > >
Thread overview: 8+ messages
2023-12-19 17:00 [PATCH net-next] sctp: fix busy polling Eric Dumazet
2023-12-22 16:08 ` Xin Long
2023-12-22 17:05 ` Eric Dumazet
2023-12-22 18:34 ` Xin Long
2024-01-03 10:51 ` Eric Dumazet
2024-01-03 15:14 ` Xin Long
2024-01-03 16:06 ` Eric Dumazet [this message]
2024-01-04 10:30 ` patchwork-bot+netdevbpf