From: "D. Wythe"
To: "David S. Miller", Dust Li, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Sidraya Jayagond, Wenjia Zhang
Cc: Mahanta Jambigi, Simon Horman, Tony Lu, Wen Gu, linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org, linux-s390@vger.kernel.org, netdev@vger.kernel.org, oliver.yang@linux.alibaba.com, pasic@linux.ibm.com
Subject: [PATCH net-next v2 2/2] net/smc: reduce TX slot contention with exclusive wait
Date: Thu, 28 May 2026 16:48:19 +0800
Message-ID: <20260528084819.6059-3-alibuda@linux.alibaba.com>
In-Reply-To: <20260528084819.6059-1-alibuda@linux.alibaba.com>
References: <20260528084819.6059-1-alibuda@linux.alibaba.com>

smc_wr_tx_get_free_slot() waits for a free TX slot with
wait_event_interruptible_timeout(). Since the wait_event family enqueues
waiters as non-exclusive, wake_up() may wake multiple waiters even though
only one can use the slot, causing thundering-herd contention when slots
are scarce.

Use an exclusive wait loop with prepare_to_wait_exclusive() so wake_up()
wakes only one waiter per freed slot.

smc_wr_wakeup_tx_wait() still uses wake_up_all() during link teardown, so
teardown behavior is unchanged.

Performance measured with netperf TCP_RR (63 flows, 200B write / 1000B
read, 60s duration):

+-------------------------------+---------------+---------------+
| smcr_max_conns_per_lgr        | 32            | 255           |
|-------------------------------+---------------+---------------|
| before                        | 4.85 Gb/s     | 657.95 Mb/s   |
|-------------------------------+---------------+---------------|
| after                         | 5.01 Gb/s     | 2.2 Gb/s      |
+-------------------------------+---------------+---------------+

Signed-off-by: D. Wythe
---
 net/smc/smc_wr.c | 36 ++++++++++++++++++++++++++----------
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/net/smc/smc_wr.c b/net/smc/smc_wr.c
index 130bc6c26fb3..3cb47f77130e 100644
--- a/net/smc/smc_wr.c
+++ b/net/smc/smc_wr.c
@@ -153,9 +153,11 @@ int smc_wr_tx_get_free_slot(struct smc_link *link,
 			    struct smc_rdma_wr **wr_rdma_buf,
 			    struct smc_wr_tx_pend_priv **wr_pend_priv)
 {
+	unsigned long timeout = SMC_WR_TX_WAIT_FREE_SLOT_TIME;
 	struct smc_link_group *lgr = smc_get_lgr(link);
 	struct smc_wr_tx_pend *wr_pend;
 	u32 idx = link->wr_tx_cnt;
+	DEFINE_WAIT(wait);
 	int rc;
 
 	*wr_buf = NULL;
@@ -165,17 +167,31 @@ int smc_wr_tx_get_free_slot(struct smc_link *link,
 		if (rc)
 			return rc;
 	} else {
-		rc = wait_event_interruptible_timeout(
-			link->wr_tx_wait,
-			!smc_link_sendable(link) ||
-			lgr->terminating ||
-			(smc_wr_tx_get_free_slot_index(link, &idx) != -EBUSY),
-			SMC_WR_TX_WAIT_FREE_SLOT_TIME);
-		if (!rc) {
-			/* timeout - terminate link */
-			smcr_link_down_cond_sched(link);
-			return -EPIPE;
+		rc = 0;
+		for (;;) {
+			prepare_to_wait_exclusive(&link->wr_tx_wait, &wait,
+						  TASK_INTERRUPTIBLE);
+			if (!smc_link_sendable(link) || lgr->terminating ||
+			    smc_wr_tx_get_free_slot_index(link, &idx) != -EBUSY)
+				break;
+			timeout = schedule_timeout(timeout);
+			/* re-check */
+			if (!smc_link_sendable(link) || lgr->terminating ||
+			    smc_wr_tx_get_free_slot_index(link, &idx) != -EBUSY)
+				break;
+			if (!timeout) {
+				/* timeout - terminate link */
+				smcr_link_down_cond_sched(link);
+				break;
+			}
+			if (signal_pending(current)) {
+				rc = -ERESTARTSYS;
+				break;
+			}
 		}
+		finish_wait(&link->wr_tx_wait, &wait);
+		if (rc)
+			return rc;
 		if (idx == link->wr_tx_cnt)
 			return -EPIPE;
 	}
-- 
2.45.0
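
Illustration only, not part of the patch: the difference between
non-exclusive and exclusive waiters described in the commit message can be
sketched with a small userspace model. This is a toy, not kernel code. The
names toy_waiter, toy_wake_up(), toy_woken_by_one_wakeup() and
TOY_MAX_WAITERS are hypothetical, and the model assumes a wake-up walk that
wakes every non-exclusive waiter and stops after nr_exclusive exclusive
ones, with 0 meaning "no limit" as for wake_up_all().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WAITERS 64

/* Toy stand-in for a wait queue entry; not a kernel structure. */
struct toy_waiter {
	bool exclusive;	/* models WQ_FLAG_EXCLUSIVE */
	bool woken;
};

/*
 * Walk the queue in order and wake waiters. Non-exclusive waiters never
 * stop the walk. The walk stops once nr_exclusive exclusive waiters have
 * been woken; nr_exclusive == 0 never reaches zero after the decrement,
 * so it wakes everyone. Returns the number of waiters woken.
 */
static int toy_wake_up(struct toy_waiter *q, size_t n, int nr_exclusive)
{
	int woken = 0;

	for (size_t i = 0; i < n; i++) {
		q[i].woken = true;
		woken++;
		if (q[i].exclusive && !--nr_exclusive)
			break;
	}
	return woken;
}

/* Queue n waiters of one kind and count how many a single wake-up wakes. */
static int toy_woken_by_one_wakeup(size_t n, bool exclusive, int nr_exclusive)
{
	struct toy_waiter q[TOY_MAX_WAITERS] = { 0 };

	assert(n <= TOY_MAX_WAITERS);
	for (size_t i = 0; i < n; i++)
		q[i].exclusive = exclusive;
	return toy_wake_up(q, n, nr_exclusive);
}
```

Under these assumptions, toy_woken_by_one_wakeup(63, false, 1) models 63
flows queued as non-exclusive waiters and a wake_up() for one freed slot,
so all 63 are woken. With exclusive waiters, the same call with
exclusive = true wakes a single waiter, while nr_exclusive = 0 (the
wake_up_all() case used at link teardown) still wakes all of them.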