From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Howells
To: netdev@vger.kernel.org
Cc: David Howells, Marc Dionne, Yunsheng Lin, "David S. Miller",
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next 14/37] rxrpc: Only set DF=1 on initial DATA transmission
Date: Mon, 2 Dec 2024 14:30:32 +0000
Message-ID: <20241202143057.378147-15-dhowells@redhat.com>
In-Reply-To: <20241202143057.378147-1-dhowells@redhat.com>
References: <20241202143057.378147-1-dhowells@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Change how the DF flag is managed on DATA transmissions. Set it on initial
transmission and don't set it on retransmissions. Then remove the handling
for EMSGSIZE in rxrpc_send_data_packet() and just pretend it didn't happen,
leaving it to the retransmission path to retry.

The path-MTU discovery using PING ACKs is then used to probe for the
maximum DATA size - though notification by ICMP will be used if one is
received.

Signed-off-by: David Howells
cc: Marc Dionne
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
---
 net/rxrpc/ar-internal.h |  1 +
 net/rxrpc/output.c      | 32 ++++++++++++++++----------------
 net/rxrpc/proc.c        |  5 +++--
 3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 55cc68dd1b40..84efa21f176c 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -98,6 +98,7 @@ struct rxrpc_net {
 	atomic_t stat_tx_data_send;
 	atomic_t stat_tx_data_send_frag;
 	atomic_t stat_tx_data_send_fail;
+	atomic_t stat_tx_data_send_msgsize;
 	atomic_t stat_tx_data_underflow;
 	atomic_t stat_tx_data_cwnd_reset;
 	atomic_t stat_rx_data;
diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
index 56695c441514..95a3819dd85d 100644
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -551,16 +551,11 @@ static int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *t
 	msg.msg_controllen = 0;
 	msg.msg_flags = MSG_SPLICE_PAGES;
 
-	/* Track what we've attempted to transmit at least once so that the
-	 * retransmission algorithm doesn't try to resend what we haven't sent
-	 * yet.
+	/* Send the packet with the don't fragment bit set unless we think it's
+	 * too big or if this is a retransmission.
 	 */
-	if (txb->seq == call->tx_transmitted + 1)
-		call->tx_transmitted = txb->seq + n - 1;
-
-	/* send the packet with the don't fragment bit set if we currently
-	 * think it's small enough */
-	if (len >= sizeof(struct rxrpc_wire_header) + call->peer->max_data) {
+	if (txb->seq == call->tx_transmitted + 1 &&
+	    len >= sizeof(struct rxrpc_wire_header) + call->peer->max_data) {
 		rxrpc_local_dont_fragment(conn->local, false);
 		frag = rxrpc_tx_point_call_data_frag;
 	} else {
@@ -568,6 +563,13 @@ static int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *t
 		frag = rxrpc_tx_point_call_data_nofrag;
 	}
 
+	/* Track what we've attempted to transmit at least once so that the
+	 * retransmission algorithm doesn't try to resend what we haven't sent
+	 * yet.
+	 */
+	if (txb->seq == call->tx_transmitted + 1)
+		call->tx_transmitted = txb->seq + n - 1;
+
 	if (IS_ENABLED(CONFIG_AF_RXRPC_INJECT_LOSS)) {
 		static int lose;
 		if ((lose++ & 7) == 7) {
@@ -578,7 +580,6 @@ static int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *t
 		}
 	}
 
-retry:
 	/* send the packet by UDP
 	 * - returns -EMSGSIZE if UDP would have to fragment the packet
 	 * to go out of the interface
@@ -589,7 +590,11 @@ static int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *t
 	ret = do_udp_sendmsg(conn->local->socket, &msg, len);
 	conn->peer->last_tx_at = ktime_get_seconds();
 
-	if (ret < 0) {
+	if (ret == -EMSGSIZE) {
+		rxrpc_inc_stat(call->rxnet, stat_tx_data_send_msgsize);
+		trace_rxrpc_tx_packet(call->debug_id, call->local->kvec[0].iov_base, frag);
+		ret = 0;
+	} else if (ret < 0) {
 		rxrpc_inc_stat(call->rxnet, stat_tx_data_send_fail);
 		trace_rxrpc_tx_fail(call->debug_id, txb->serial, ret, frag);
 	} else {
@@ -597,11 +602,6 @@ static int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *t
 	}
 
 	rxrpc_tx_backoff(call, ret);
-	if (ret == -EMSGSIZE && frag == rxrpc_tx_point_call_data_nofrag) {
-		rxrpc_local_dont_fragment(conn->local, false);
-		frag = rxrpc_tx_point_call_data_frag;
-		goto retry;
-	}
 
 done:
 	if (ret >= 0) {
diff --git a/net/rxrpc/proc.c b/net/rxrpc/proc.c
index 1f1387cf62c8..aab392b4281f 100644
--- a/net/rxrpc/proc.c
+++ b/net/rxrpc/proc.c
@@ -476,10 +476,11 @@ int rxrpc_stats_show(struct seq_file *seq, void *v)
 	struct rxrpc_net *rxnet = rxrpc_net(seq_file_single_net(seq));
 
 	seq_printf(seq,
-		   "Data : send=%u sendf=%u fail=%u\n",
+		   "Data : send=%u sendf=%u fail=%u emsz=%u\n",
 		   atomic_read(&rxnet->stat_tx_data_send),
 		   atomic_read(&rxnet->stat_tx_data_send_frag),
-		   atomic_read(&rxnet->stat_tx_data_send_fail));
+		   atomic_read(&rxnet->stat_tx_data_send_fail),
+		   atomic_read(&rxnet->stat_tx_data_send_msgsize));
 	seq_printf(seq,
 		   "Data-Tx : nr=%u retrans=%u uf=%u cwr=%u\n",
 		   atomic_read(&rxnet->stat_tx_data),
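
Illustrative note (not part of the patch): the two bookkeeping changes in net/rxrpc/output.c can be modelled in user space roughly as below. This is only a sketch under stated assumptions. Every model_* name is hypothetical and none of it is kernel code. It shows (a) why the tx_transmitted update has to come after the "is this the first transmission?" test, since both rely on seq == tx_transmitted + 1, and (b) how -EMSGSIZE is counted and then reported as 0 so that the caller leaves recovery to the retransmission path.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel structures. */
struct model_stats {
	unsigned int send_fail;    /* plays the role of stat_tx_data_send_fail */
	unsigned int send_msgsize; /* plays the role of stat_tx_data_send_msgsize */
};

struct model_call {
	uint32_t tx_transmitted;   /* highest sequence number attempted so far */
};

/* True if seq is the next sequence number that has never been attempted,
 * i.e. this would be an initial transmission rather than a retransmission. */
static bool model_is_initial_tx(const struct model_call *call, uint32_t seq)
{
	return seq == call->tx_transmitted + 1;
}

/* Record that n packets starting at seq have now been attempted once.
 * This must run after model_is_initial_tx() has been consulted; once
 * tx_transmitted has advanced, the same seq no longer looks initial. */
static void model_note_attempt(struct model_call *call, uint32_t seq, uint32_t n)
{
	if (seq == call->tx_transmitted + 1)
		call->tx_transmitted = seq + n - 1;
}

/* Fold a send result the way the patched hunk does: -EMSGSIZE is counted
 * separately and turned into 0, any other negative value is counted as a
 * failure and passed through, and non-negative values pass through as-is. */
static int model_fold_send_result(struct model_stats *stats, int ret)
{
	if (ret == -EMSGSIZE) {
		stats->send_msgsize++;
		return 0;
	}
	if (ret < 0)
		stats->send_fail++;
	return ret;
}
```

With tx_transmitted at 4, sequence 5 counts as an initial transmission. After model_note_attempt(&call, 5, 3) the field is 7, so a later send of sequence 5 is treated as a retransmission. The model deliberately leaves out the packet-size comparison against max_data and the actual DF socket option, which depend on kernel state not reproduced here.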