public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Tony Lu <tonylu@linux.alibaba.com>
To: Karsten Graul <kgraul@linux.ibm.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
	davem@davemloft.net, netdev@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-rdma@vger.kernel.org,
	jacob.qi@linux.alibaba.com, xuanzhuo@linux.alibaba.com,
	guwen@linux.alibaba.com, dust.li@linux.alibaba.com
Subject: Re: [PATCH net 1/4] Revert "net/smc: don't wait for send buffer space when data was already sent"
Date: Wed, 3 Nov 2021 11:06:06 +0800	[thread overview]
Message-ID: <YYH8npT0+ww57Gg1@TonyMac-Alibaba> (raw)
In-Reply-To: <ca2a567b-915e-c4e1-96cf-2c03ff74aad5@linux.ibm.com>

On Tue, Nov 02, 2021 at 10:17:15AM +0100, Karsten Graul wrote:
> On 01/11/2021 08:04, Tony Lu wrote:
> > On Thu, Oct 28, 2021 at 07:38:27AM -0700, Jakub Kicinski wrote:
> >> On Thu, 28 Oct 2021 13:57:55 +0200 Karsten Graul wrote:
> >>> So how to deal with all of this? Is it an accepted programming error
> >>> when a user space program gets itself into this kind of situation?
> >>> Since this problem depends on internal send/recv buffer sizes such a
> >>> program might work on one system but not on other systems.
> >>
> >> It's a gray area so unless someone else has a strong opinion we can
> >> leave it as is.
> > 
> > Things might be different. IMHO, the key point of this problem is to
> > implement the "standard" POSIX socket API, or TCP-socket compatible API.
> > 
> >>> At the end the question might be if either such kind of a 'deadlock'
> >>> is acceptable, or if it is okay to have send() return lesser bytes
> >>> than requested.
> >>
> >> Yeah.. the thing is we have better APIs for applications to ask not to
> >> block than we do for applications to block. If someone really wants to
> >> wait for all data to come out for performance reasons they will
> >> struggle to get that behavior. 
> > 
> > IMO, it is better to do something to unify this behavior. Some
> > applications like netperf would be broken, and the people who want to use
> > SMC to run basic benchmark, would be confused about this, and its
> > compatibility with TCP. Maybe we could:
> > 1) correct the behavior of netperf to check the rc as we discussed.
> > 2) "copy" the behavior of TCP, and try to compatiable with TCP, though
> > it is a gray area.
> 
> I have a strong opinion here, so when the question is if the user either
> encounters a deadlock or if send() returns lesser bytes than requested,
> I prefer the latter behavior.
> The second case is much easier to debug for users, they can do something
> to handle the problem (loop around send()), and this case can even be detected
> using strace. But the deadlock case is nearly not debuggable by users and there
> is nothing to prevent it when the workload pattern runs into this situation
> (except to not use blocking sends).

I agree with you. I am curious about this deadlock scene. If it was
convenient, could you provide a reproducible test case? We are also
setting up a SMC CI/CD system to find the compatible and performance
fallback problems. Maybe we could do something to make it better.

Cheers,
Tony Lu

  reply	other threads:[~2021-11-03  3:06 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-27  8:52 [PATCH net 0/4] Fixes for SMC Tony Lu
2021-10-27  8:52 ` [PATCH net 1/4] Revert "net/smc: don't wait for send buffer space when data was already sent" Tony Lu
2021-10-27 10:21   ` Karsten Graul
2021-10-27 15:08     ` Jakub Kicinski
2021-10-27 15:38       ` Karsten Graul
2021-10-27 15:47         ` Jakub Kicinski
2021-10-28  6:48           ` Tony Lu
2021-10-28 11:57           ` Karsten Graul
2021-10-28 14:38             ` Jakub Kicinski
2021-11-01  7:04               ` Tony Lu
2021-11-02  9:17                 ` Karsten Graul
2021-11-03  3:06                   ` Tony Lu [this message]
2021-11-06 12:46                     ` Karsten Graul
2021-10-27  8:52 ` [PATCH net 2/4] net/smc: Fix smc_link->llc_testlink_time overflow Tony Lu
2021-10-27 10:24   ` Karsten Graul
2021-10-28  6:52     ` Tony Lu
2021-10-27  8:52 ` [PATCH net 3/4] net/smc: Correct spelling mistake to TCPF_SYN_RECV Tony Lu
2021-10-27 10:23   ` Karsten Graul
2021-10-28  6:53     ` Tony Lu
2021-10-27  8:52 ` [PATCH net 4/4] net/smc: Fix wq mismatch issue caused by smc fallback Tony Lu
2021-10-28 14:16   ` Karsten Graul
2021-10-29  9:40   ` Karsten Graul
2021-11-01  6:15     ` Wen Gu
2021-11-02  9:25       ` Karsten Graul
2021-11-03  8:56         ` Wen Gu
2021-11-04  4:39         ` Wen Gu
2021-11-06 12:51           ` Karsten Graul
2021-11-10 12:50             ` Wen Gu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YYH8npT0+ww57Gg1@TonyMac-Alibaba \
    --to=tonylu@linux.alibaba.com \
    --cc=davem@davemloft.net \
    --cc=dust.li@linux.alibaba.com \
    --cc=guwen@linux.alibaba.com \
    --cc=jacob.qi@linux.alibaba.com \
    --cc=kgraul@linux.ibm.com \
    --cc=kuba@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox