From: Dust Li <dust.li@linux.alibaba.com>
To: Wenjia Zhang <wenjia@linux.ibm.com>,
Wen Gu <guwen@linux.alibaba.com>,
"D. Wythe" <alibuda@linux.alibaba.com>,
Tony Lu <tonylu@linux.alibaba.com>,
David Miller <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org, linux-s390@vger.kernel.org,
Heiko Carstens <hca@linux.ibm.com>,
Jan Karcher <jaka@linux.ibm.com>,
Gerd Bayer <gbayer@linux.ibm.com>,
Alexandra Winter <wintera@linux.ibm.com>,
Halil Pasic <pasic@linux.ibm.com>,
Nils Hoppmann <niho@linux.ibm.com>,
Niklas Schnell <schnelle@linux.ibm.com>,
Thorsten Winkler <twinkler@linux.ibm.com>,
Karsten Graul <kgraul@linux.ibm.com>,
Stefan Raspl <raspl@linux.ibm.com>
Subject: Re: [PATCH net-next] net/smc: increase SMC_WR_BUF_CNT
Date: Sat, 26 Oct 2024 07:58:39 +0800
Message-ID: <20241025235839.GD36583@linux.alibaba.com>
In-Reply-To: <20241025074619.59864-1-wenjia@linux.ibm.com>
On 2024-10-25 09:46:19, Wenjia Zhang wrote:
>From: Halil Pasic <pasic@linux.ibm.com>
>
>The current value of SMC_WR_BUF_CNT is 16 which leads to heavy
>contention on the wr_tx_wait workqueue of the SMC-R linkgroup and its
>spinlock when many connections are competing for the buffer. Currently
>up to 256 connections per linkgroup are supported.
>
>To make things worse when finally a buffer becomes available and
>smc_wr_tx_put_slot() signals the linkgroup's wr_tx_wait wq, because
>WQ_FLAG_EXCLUSIVE is not used all the waiters get woken up, most of the
>time a single one can proceed, and the rest is contending on the
>spinlock of the wq to go to sleep again.
>
>For some reason include/linux/wait.h does not offer a top level wrapper
>macro for wait_event with interruptible, exclusive and timeout. I did
>not spend too many cycles on thinking if that is even a combination that
>makes sense (on the quick I don't see why not) and conversely I
>refrained from making an attempt to accomplish the interruptible,
>exclusive and timeout combo by using the abstraction-wise lower
>level __wait_event interface.
>
>To alleviate the tx performance bottleneck and the CPU overhead due to
>the spinlock contention, let us increase SMC_WR_BUF_CNT to 256.

Hi,

Have you tested other values, such as 64? We have been using 64 in our
internal version for some time.

Increasing this to 256 will require a 36 KB physically contiguous memory
allocation in smc_wr_alloc_link_mem(). In my experience, this may fail
on servers that have been running for a long time and have fragmented
memory.

    link->wr_rx_bufs = kcalloc(SMC_WR_BUF_CNT * 3, SMC_WR_BUF_SIZE,
                               GFP_KERNEL);

As we can see, the size of link->wr_rx_bufs will grow from
16*3*48 = 2,304 bytes to 256*3*48 = 36,864 bytes (from 1 page to
9 pages, assuming 4 KB pages).
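
To make the arithmetic easy to repeat for other candidate values, here
is a minimal userspace sketch. It assumes SMC_WR_BUF_SIZE is 48 bytes
(as in the numbers above) and a 4 KB page size; the macro and helper
names are made up for illustration and are not kernel symbols.

```c
#include <stddef.h>

/* Assumptions for illustration only, not copied from kernel headers: */
#define WR_BUF_SIZE 48u   /* bytes per ctrl buffer, per the arithmetic above */
#define PAGE_BYTES  4096u /* assumed page size */

/* Bytes requested for link->wr_rx_bufs: kcalloc(cnt * 3, WR_BUF_SIZE). */
static size_t wr_rx_bufs_bytes(size_t buf_cnt)
{
	return buf_cnt * 3 * WR_BUF_SIZE;
}

/* Pages needed to hold that many bytes, rounded up to whole pages. */
static size_t wr_rx_bufs_pages(size_t buf_cnt)
{
	return (wr_rx_bufs_bytes(buf_cnt) + PAGE_BYTES - 1) / PAGE_BYTES;
}
```

Under these assumptions, a count of 16 gives 2,304 bytes (1 page), 64
gives 9,216 bytes (3 pages), and 256 gives 36,864 bytes (9 pages). The
real allocator may round the request up further; that is not modelled
here.
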
Best regards,
Dust
>
>Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>Reported-by: Nils Hoppmann <niho@linux.ibm.com>
>Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
>Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
>---
> net/smc/smc_wr.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/net/smc/smc_wr.h b/net/smc/smc_wr.h
>index f3008dda222a..81e772e241f3 100644
>--- a/net/smc/smc_wr.h
>+++ b/net/smc/smc_wr.h
>@@ -19,7 +19,7 @@
> #include "smc.h"
> #include "smc_core.h"
>
>-#define SMC_WR_BUF_CNT 16 /* # of ctrl buffers per link */
>+#define SMC_WR_BUF_CNT 256 /* # of ctrl buffers per link */
>
> #define SMC_WR_TX_WAIT_FREE_SLOT_TIME (10 * HZ)
>
>--
>2.43.0
>
Thread overview: 7+ messages
2024-10-25 7:46 [PATCH net-next] net/smc: increase SMC_WR_BUF_CNT Wenjia Zhang
2024-10-25 13:56 ` Simon Horman
2024-10-25 23:58 ` Dust Li [this message]
2024-10-31 12:30 ` Halil Pasic
2024-11-04 16:42 ` Halil Pasic
2024-11-05 10:16 ` Paolo Abeni
2024-11-05 14:34 ` Dust Li