From: Tony Lu <tonylu@linux.alibaba.com>
To: Stefan Raspl <raspl@linux.ibm.com>
Cc: kgraul@linux.ibm.com, kuba@kernel.org, davem@davemloft.net,
netdev@vger.kernel.org, linux-s390@vger.kernel.org
Subject: Re: [PATCH net-next 2/3] net/smc: Remove corked dealyed work
Date: Mon, 14 Feb 2022 20:10:17 +0800
Message-ID: <YgpGqV11uW6RfSAt@TonyMac-Alibaba>
In-Reply-To: <f4166712-9a1e-51a0-409d-b7df25a66c52@linux.ibm.com>

On Mon, Feb 14, 2022 at 11:29:10AM +0100, Stefan Raspl wrote:
> On 2/11/22 10:10, Tony Lu wrote:
> > On Mon, Jan 31, 2022 at 08:40:47PM +0100, Stefan Raspl wrote:
> > > On 1/30/22 19:02, Tony Lu wrote:
> > > > Based on the manual of TCP_CORK [1] and MSG_MORE [2], these two options
> > > > have the same effect. Applications can set these options to inform the
> > > > kernel to hold back the data, and to send it out only when the socket or
> > > > syscall no longer specifies this flag. In other words, there's no need to
> > > > send data out by a delayed work, which will queue a lot of work.
> > > >
> > > > This removes the corked delayed work with SMC_TX_CORK_DELAY (250ms), and the
> > > > applications control how/when to send the data out. It improves the
> > > > performance for sendfile and throughput, and removes an unnecessary race for
> > > > lock_sock(). This also lifts the limitation on sndbuf, and tries to fill
> > > > it up before sending.
> > > >
> > > > [1] https://linux.die.net/man/7/tcp
> > > > [2] https://man7.org/linux/man-pages/man2/send.2.html
> > > >
> > > > Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
> > > > ---
> > > > net/smc/smc_tx.c | 15 ++++++---------
> > > > 1 file changed, 6 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
> > > > index 7b0b6e24582f..9cec62cae7cb 100644
> > > > --- a/net/smc/smc_tx.c
> > > > +++ b/net/smc/smc_tx.c
> > > > @@ -31,7 +31,6 @@
> > > > #include "smc_tracepoint.h"
> > > > #define SMC_TX_WORK_DELAY 0
> > > > -#define SMC_TX_CORK_DELAY (HZ >> 2) /* 250 ms */
> > > > /***************************** sndbuf producer *******************************/
> > > > @@ -237,15 +236,13 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
> > > > if ((msg->msg_flags & MSG_OOB) && !send_remaining)
> > > > conn->urg_tx_pend = true;
> > > > if ((msg->msg_flags & MSG_MORE || smc_tx_is_corked(smc)) &&
> > > > - (atomic_read(&conn->sndbuf_space) >
> > > > - (conn->sndbuf_desc->len >> 1)))
> > > > - /* for a corked socket defer the RDMA writes if there
> > > > - * is still sufficient sndbuf_space available
> > > > + (atomic_read(&conn->sndbuf_space)))
> > > > + /* for a corked socket defer the RDMA writes if
> > > > + * sndbuf_space is still available. The applications
> > > > + * should known how/when to uncork it.
> > > > */
> > > > - queue_delayed_work(conn->lgr->tx_wq, &conn->tx_work,
> > > > - SMC_TX_CORK_DELAY);
> > > > - else
> > > > - smc_tx_sndbuf_nonempty(conn);
> > > > + continue;
> > >
> > > In case we just corked the final bytes in this call, wouldn't this
> > > 'continue' prevent us from accounting the Bytes that we just staged to be
> > > sent out later in the trace_smc_tx_sendmsg() call below?
> > >
> > > > + smc_tx_sndbuf_nonempty(conn);
> > > > trace_smc_tx_sendmsg(smc, copylen);
> > >
> >
> > If the application sends out the final bytes in this call, the
> > application should also clear the MSG_MORE or TCP_CORK flag; this is
> > required based on the manuals [1] and [2]. So it is safe to cork the data
> > if the flag is set, and continue to the next loop iteration until the
> > application clears the flag.
>
> Yes, I understand. But trace_smc_tx_sendmsg(smc, copylen) should be called
> for each portion of data that we transmit, i.e. each time we run through
> this loop. That is because parameter copylen is reset during each iteration.
> Now your patch adds a 'continue', which prevents that trace_smc_tx... call
> from being made. Which means the information that 'copylen' Bytes were
> transferred is lost forever, and the accounting of tx Bytes is off by
> 'copylen' Bytes, I believe!

This makes sense to me. The trace call shouldn't be skipped when the data
is corked. I will fix it in the next version of the patch.
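
To illustrate the accounting problem, here is a small user-space sketch.
It is not kernel code and not the actual fix: sendmsg_sketch(), trace_tx()
and struct tx_stats are made-up names, and the 'corked' argument stands in
for "MSG_MORE or TCP_CORK is set and sndbuf space is still available".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the tracepoint accounting. */
struct tx_stats {
	size_t traced_bytes;	/* bytes reported through the trace hook */
	size_t flush_calls;	/* how often a send was kicked off */
};

/* Plays the role of trace_smc_tx_sendmsg(smc, copylen) in this sketch. */
static void trace_tx(struct tx_stats *st, size_t copylen)
{
	st->traced_bytes += copylen;
}

/*
 * Stage 'len' bytes in pieces of at most 'chunk' bytes.
 * skip_trace_when_corked == true models the 'continue' from the patch;
 * false models deferring the send but still tracing every piece.
 */
static size_t sendmsg_sketch(struct tx_stats *st, size_t len, size_t chunk,
			     bool corked, bool skip_trace_when_corked)
{
	size_t sent = 0;

	assert(chunk > 0);
	while (sent < len) {
		/* copylen is recomputed on every iteration */
		size_t copylen = len - sent < chunk ? len - sent : chunk;

		sent += copylen;	/* data is now staged in the send buffer */
		if (corked) {
			if (skip_trace_when_corked)
				continue;	/* copylen is never accounted */
		} else {
			st->flush_calls++;	/* stands in for smc_tx_sndbuf_nonempty() */
		}
		trace_tx(st, copylen);
	}
	return sent;
}
```

With skip_trace_when_corked set, a corked call stages all of its bytes but
reports none of them to the trace hook. With it cleared, the send is still
deferred, yet every staged piece is traced.
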
Thank you,
Tony Lu