public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Tony Lu <tonylu@linux.alibaba.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Karsten Graul <kgraul@linux.ibm.com>,
	davem@davemloft.net, netdev@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-rdma@vger.kernel.org,
	jacob.qi@linux.alibaba.com, xuanzhuo@linux.alibaba.com,
	guwen@linux.alibaba.com, dust.li@linux.alibaba.com
Subject: Re: [PATCH net 1/4] Revert "net/smc: don't wait for send buffer space when data was already sent"
Date: Thu, 28 Oct 2021 14:48:28 +0800	[thread overview]
Message-ID: <YXpHvKGhPzcNoEtD@TonyMac-Alibaba> (raw)
In-Reply-To: <20211027084710.1f4a4ff1@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

On Wed, Oct 27, 2021 at 08:47:10AM -0700, Jakub Kicinski wrote:
> On Wed, 27 Oct 2021 17:38:27 +0200 Karsten Graul wrote:
> > What we found out was that applications called sendmsg() with large data
> > buffers using blocking sockets. This led to the described situation, were the
> > solution was to early return to user space even if not all data were sent yet.
> > Userspace applications should not have a problem with the fact that sendmsg()
> > returns a smaller byte count than requested.
> > 
> > Reverting this patch would bring back the stalled connection problem.
> 
> I'm not sure. The man page for send says:
> 
>        When the message does not fit into  the  send  buffer  of  the  socket,
>        send()  normally blocks, unless the socket has been placed in nonblock‐
>        ing I/O mode.  In nonblocking mode it would fail with the error  EAGAIN
>        or  EWOULDBLOCK in this case.
> 
> dunno if that's required by POSIX or just a best practice.

The man page describes the common cases about the socket API behavior,
but depends on the implement.

For example, the connect(2) implement of TCP, it would never block, but
also provides EAGAIN errors for UNIX domain sockets.

       EAGAIN For nonblocking UNIX domain sockets, the socket is
              nonblocking, and the connection cannot be completed
              immediately.  For other socket families, there are
              insufficient entries in the routing cache.

In my opinion, if we are going to replace TCP with SMC, these userspace
socket API should behavior as same, and don't break the userspace
applications like netperf. It could be better to block here when sending
message without enough buffer.

In our benchmarks and E2E tests (Redis, MySQL, etc.), it is acceptable to
block here. Because the userspace applications usually block in the loop
until the data send out. If it blocks, the scheduler can handle it.

  reply	other threads:[~2021-10-28  6:48 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-27  8:52 [PATCH net 0/4] Fixes for SMC Tony Lu
2021-10-27  8:52 ` [PATCH net 1/4] Revert "net/smc: don't wait for send buffer space when data was already sent" Tony Lu
2021-10-27 10:21   ` Karsten Graul
2021-10-27 15:08     ` Jakub Kicinski
2021-10-27 15:38       ` Karsten Graul
2021-10-27 15:47         ` Jakub Kicinski
2021-10-28  6:48           ` Tony Lu [this message]
2021-10-28 11:57           ` Karsten Graul
2021-10-28 14:38             ` Jakub Kicinski
2021-11-01  7:04               ` Tony Lu
2021-11-02  9:17                 ` Karsten Graul
2021-11-03  3:06                   ` Tony Lu
2021-11-06 12:46                     ` Karsten Graul
2021-10-27  8:52 ` [PATCH net 2/4] net/smc: Fix smc_link->llc_testlink_time overflow Tony Lu
2021-10-27 10:24   ` Karsten Graul
2021-10-28  6:52     ` Tony Lu
2021-10-27  8:52 ` [PATCH net 3/4] net/smc: Correct spelling mistake to TCPF_SYN_RECV Tony Lu
2021-10-27 10:23   ` Karsten Graul
2021-10-28  6:53     ` Tony Lu
2021-10-27  8:52 ` [PATCH net 4/4] net/smc: Fix wq mismatch issue caused by smc fallback Tony Lu
2021-10-28 14:16   ` Karsten Graul
2021-10-29  9:40   ` Karsten Graul
2021-11-01  6:15     ` Wen Gu
2021-11-02  9:25       ` Karsten Graul
2021-11-03  8:56         ` Wen Gu
2021-11-04  4:39         ` Wen Gu
2021-11-06 12:51           ` Karsten Graul
2021-11-10 12:50             ` Wen Gu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YXpHvKGhPzcNoEtD@TonyMac-Alibaba \
    --to=tonylu@linux.alibaba.com \
    --cc=davem@davemloft.net \
    --cc=dust.li@linux.alibaba.com \
    --cc=guwen@linux.alibaba.com \
    --cc=jacob.qi@linux.alibaba.com \
    --cc=kgraul@linux.ibm.com \
    --cc=kuba@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox