public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jon Maxwell <jmaxwell37@gmail.com>
To: davem@davemloft.net
Cc: edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	yoshfuji@linux-ipv6.org, dsahern@kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Jon Maxwell <jmaxwell37@gmail.com>
Subject: [net-next] ipv6: fix routing cache overflow for raw sockets
Date: Mon, 19 Dec 2022 10:48:01 +1100	[thread overview]
Message-ID: <20221218234801.579114-1-jmaxwell37@gmail.com> (raw)

Sending Ipv6 packets in a loop via a raw socket triggers an issue where a 
route is cloned by ip6_rt_cache_alloc() for each packet sent. This quickly 
consumes the Ipv6 max_size threshold which defaults to 4096 resulting in 
these warnings:

[1]   99.187805] dst_alloc: 7728 callbacks suppressed
[2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
.
.
[300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.

When this happens the packet is dropped and sendto() gets a network is 
unreachable error:

# ./a.out -s 

remaining pkt 200557 errno 101
remaining pkt 196462 errno 101
.
.
remaining pkt 126821 errno 101

Fix this by adding a flag to prevent the cloning of routes for raw sockets. 
Which makes the Ipv6 routing code use per-cpu routes instead which prevents 
packet drop due to max_size overflow. 

Ipv4 is not affected because it has a very large default max_size.

Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
---
 include/net/flow.h | 1 +
 net/ipv6/raw.c     | 2 +-
 net/ipv6/route.c   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/net/flow.h b/include/net/flow.h
index 2f0da4f0318b..30b8973ffb4b 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -37,6 +37,7 @@ struct flowi_common {
 	__u8	flowic_flags;
 #define FLOWI_FLAG_ANYSRC		0x01
 #define FLOWI_FLAG_KNOWN_NH		0x02
+#define FLOWI_FLAG_SKIP_RAW		0x04
 	__u32	flowic_secid;
 	kuid_t  flowic_uid;
 	struct flowi_tunnel flowic_tun_key;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index a06a9f847db5..0b89a7e66d09 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -884,7 +884,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
 
 	if (hdrincl)
-		fl6.flowi6_flags |= FLOWI_FLAG_KNOWN_NH;
+		fl6.flowi6_flags |= FLOWI_FLAG_KNOWN_NH | FLOWI_FLAG_SKIP_RAW;
 
 	if (ipc6.tclass < 0)
 		ipc6.tclass = np->tclass;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index e74e0361fd92..beae0bd61738 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2226,6 +2226,7 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
 	if (rt) {
 		goto out;
 	} else if (unlikely((fl6->flowi6_flags & FLOWI_FLAG_KNOWN_NH) &&
+			    !(fl6->flowi6_flags & FLOWI_FLAG_SKIP_RAW) &&
 			    !res.nh->fib_nh_gw_family)) {
 		/* Create a RTF_CACHE clone which will not be
 		 * owned by the fib6 tree.  It is for the special case where
-- 
2.31.1


             reply	other threads:[~2022-12-18 23:52 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-18 23:48 Jon Maxwell [this message]
2022-12-20 12:35 ` [net-next] ipv6: fix routing cache overflow for raw sockets Paolo Abeni
2022-12-20 15:10   ` David Ahern
2022-12-20 21:55     ` Jonathan Maxwell
2022-12-21  4:31       ` Jonathan Maxwell
2022-12-22  5:39         ` Jonathan Maxwell
2022-12-22 16:17           ` David Ahern
2022-12-22 22:36             ` Jonathan Maxwell
2022-12-20 15:17   ` Julian Anastasov
2022-12-20 15:41   ` Julian Anastasov
2022-12-20 21:48   ` Jonathan Maxwell
2022-12-23 20:28     ` Andrea Mayer
2022-12-24  7:38       ` Jonathan Maxwell
2023-01-02 23:59         ` Jonathan Maxwell
2023-01-03 16:07           ` Andrea Mayer
2023-01-06 23:26             ` Andrea Mayer
2023-01-07 23:46               ` Jonathan Maxwell
2023-01-08 17:34                 ` Andrea Mayer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221218234801.579114-1-jmaxwell37@gmail.com \
    --to=jmaxwell37@gmail.com \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=yoshfuji@linux-ipv6.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox